Wednesday, 28 August 2013

Idea on building a live, serverless multi-user local-network-shared database with SQLite

I need some input on an idea I had. If you're interested/experienced in
SQLite and lowest-budget-high-end solutions (Windows only for now ;)),
this is for you. While fleshing out this question I got pretty excited
about the idea, so bear with me. There's a TL;DR below.
I have developed a VB.NET (3.5) desktop application with SQLite. The data
is structured in an EAV model (entity-attribute-value), because I need
that flexibility (the semantic structure of the data is stored separately
in an XML file). The application actually performs well above my
expectations, even with a database that is large for the scenario at hand
(about 120 MB for the main file).
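For illustration, here is a minimal sketch of what such an EAV table could
look like (table and column names are made up for this post, not my actual
schema; I'm assuming the System.Data.SQLite ADO.NET provider):

    Imports System.Data.SQLite

    Module EavSketch
        Sub CreateSchema()
            Using conn As New SQLiteConnection("Data Source=main.db")
                conn.Open()
                ' Illustrative EAV table; the attribute semantics live in the XML file.
                Dim ddl As String = "CREATE TABLE IF NOT EXISTS eav ("
                ddl &= " entity_id  INTEGER NOT NULL,"  ' the object being described
                ddl &= " attribute  TEXT NOT NULL,"     ' attribute name
                ddl &= " value      TEXT,"              ' the value, stored as text
                ddl &= " changed_at TEXT NOT NULL,"     ' ISO-8601 timestamp for the change history
                ddl &= " changed_by TEXT NOT NULL)"     ' user id, so fields can be rolled back individually
                Using cmd As New SQLiteCommand(ddl, conn)
                    cmd.ExecuteNonQuery()
                End Using
            End Using
        End Sub
    End Module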
As expected, performance is abysmal when the SQLite file(s) are in a
network folder. It's bearable, but I have higher goals. The expected
scenario (for now) is a maximum of 10 users accessing the database
concurrently within a local Windows network. 98% of the concurrent access
is heavy reading; inserts and updates are sparse and small.
The software is used almost exclusively in environments with a low budget
and even lower technical infrastructure (support and hardware), so my goal
is to avoid using a database server. I also do not want to implement my
own SQLite "server" à la SQLitening (i.e. tell one instance of the
application to automagically "share" the database on the network), because
I want the database to be able to reside on an independent network drive.
My first impulse, after realizing the situation cannot be fixed by
optimizing queries, was a "lazy synchronization" approach, which would be
mostly painless to implement. The database is "wiki-esque" with (almost)
only inserts, so there won't be many conflict issues at all: whoever comes
"last" wins, and each field has a change history with timestamp and user
id and can be rolled back individually. Entries are "marked as deleted"
and can be discarded by a "cleanup" action. But it comes at the cost of
never being "live" with the data other users change or enter during a
session, and the syncing process might take a while, with users
potentially blocking each other at "rush hour". We're talking maybe a
couple of minutes worst-case, which wouldn't be a big deal, but it
wouldn't be cool either.
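To make "whoever comes last wins" concrete: with history columns like
those in the sketch above, the live value of a field is simply the newest
history row per (entity, attribute), and rollback just means removing the
newest row. Something like this (illustrative schema again; a correlated
subquery, since SQLite offers no window functions here):

    Imports System.Data.SQLite

    Module LiveView
        ' Live values under last-writer-wins: the newest history row
        ' per (entity, attribute) wins; older rows remain for rollback.
        Sub PrintLiveValues(ByVal conn As SQLiteConnection)
            Dim sql As String = "SELECT e.entity_id, e.attribute, e.value FROM eav e"
            sql &= " WHERE e.changed_at = (SELECT MAX(h.changed_at) FROM eav h"
            sql &= "  WHERE h.entity_id = e.entity_id AND h.attribute = e.attribute)"
            Using cmd As New SQLiteCommand(sql, conn)
                Using rdr As SQLiteDataReader = cmd.ExecuteReader()
                    While rdr.Read()
                        Console.WriteLine("{0}.{1} = {2}", rdr(0), rdr(1), rdr(2))
                    End While
                End Using
            End Using
        End Sub
    End Module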
TL;DR: How do I implement a live, serverless, local-network-shared
database with SQLite for an end-user application that already performs
very well on a local data file, in a scenario with
many SELECTs
few INSERTs
no DELETEs and
hardly any UPDATEs per user session.
Let's assume, furthermore, that
there is always sufficient hard drive space available
due to archive mechanisms (and data privacy constraints) the database
never grows above 200 MB
we can efficiently tell whether a file in the "shared directory" where the
database files reside has been changed, and by whom
we can copy files sufficiently fast from the shared directory
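Sketching that "has it changed" check (the CRC+logfile idea below): .NET
3.5 has no built-in CRC32, so I'd settle for a hash from
System.Security.Cryptography, with file size and last-write time as a
cheap pre-check before hashing 200 MB over the network. All names are
illustrative:

    Imports System.IO
    Imports System.Security.Cryptography

    Module ChangeCheck
        ' Hex digest of a file; compare against the value recorded in the
        ' logfile in the shared directory to decide whether to re-copy.
        Function FileDigest(ByVal path As String) As String
            Using sha As SHA1 = SHA1.Create()
                Using fs As New FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read)
                    Return BitConverter.ToString(sha.ComputeHash(fs)).Replace("-", "")
                End Using
            End Using
        End Function

        ' Cheap pre-check: size and last-write time usually change when content does.
        Function ProbablyChanged(ByVal path As String, ByVal knownSize As Long, ByVal knownWrite As DateTime) As Boolean
            Dim fi As New FileInfo(path)
            Return fi.Length <> knownSize OrElse fi.LastWriteTimeUtc <> knownWrite
        End Function
    End Module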
Now, what I had in mind was implementing differencing files for each
session, locally cached for read access (there's a code sketch after this
outline).
At the start of a client-session:
check the big file and session-specific (see below) files in the shared
directory for changes since the last session (CRC+logfile in the shared
directory)
copy the big (200 MB) current database file to a local path before each
session, if changed or not yet cached
also copy all session-specific files (see below), if changed or not yet
cached
all INSERTs during a session are written to a small, session-specific file
in the shared directory
this could be limited to a suitable size, like 2 MB per file or even
smaller; once a file is full, a new one is created
read-access (SELECTs) is then performed sequentially
in the local copy of the main file
in the local copies of the session files
in all current (i.e. new) session files in the shared directory
on detection of a new session file, session files will be copied to the
local cache again
finally, the session files are periodically merged into the big file
this could be at the end of every session, but if I am not mistaken, it
could be whenever.
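The "sequential read" part is where SQLite itself can help: with ATTACH
DATABASE, the local copy of the main file and the session files can hang
off one connection and be queried together, so the application doesn't
have to loop over files by hand. A rough sketch of the cycle, every path
and name a placeholder, using the illustrative eav schema from above:

    Imports System.Data.SQLite

    Module SessionCycle
        Sub Demo()
            ' The locally cached copy of the big file is the primary database.
            Using conn As New SQLiteConnection("Data Source=C:\cache\main.db")
                conn.Open()

                ' Attach my own writable session file (lives in the shared directory)
                ' and the locally cached copy of another user's session file.
                Exec(conn, "ATTACH DATABASE 'X:\shared\session_me.db' AS me")
                Exec(conn, "ATTACH DATABASE 'C:\cache\session_u2.db' AS u2")

                ' During a session, INSERTs go only to my small session file.
                Exec(conn, "INSERT INTO me.eav (entity_id, attribute, value, changed_at, changed_by) " & _
                           "VALUES (42, 'title', 'hello', '2013-08-28T12:00:00', 'christoph')")

                ' SELECTs see the main file and all session files in one pass.
                ' (A TEMP view may reference tables in any attached database.)
                Exec(conn, "CREATE TEMP VIEW eav_all AS " & _
                           "SELECT * FROM main.eav UNION ALL " & _
                           "SELECT * FROM me.eav UNION ALL " & _
                           "SELECT * FROM u2.eav")

                ' Periodic merge: fold a session file into the big file, then empty it.
                ' (In the real protocol this would run against the shared big file,
                ' not the local cache as shown here for brevity.)
                Exec(conn, "INSERT INTO main.eav SELECT * FROM me.eav")
                Exec(conn, "DELETE FROM me.eav")
            End Using
        End Sub

        Private Sub Exec(ByVal conn As SQLiteConnection, ByVal sql As String)
            Using cmd As New SQLiteCommand(sql, conn)
                cmd.ExecuteNonQuery()
            End Using
        End Sub
    End Module

One caveat I'm aware of: SQLite limits the number of attached databases
(10 by default), which sits uncomfortably close to the 10-user target, so
that limit would need watching.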
This would
eliminate all write concurrency
eliminate read concurrency on the big file and all local session files
reduce the needed concurrency to the reads on the small session files
reduce network usage to fetching the most current session files
(2 MB per concurrent user)
preserve a live view of the current data state for every client
This sounds beyond awesome to me.
Questions: Is there a name for this "protocol" I am outlining, so I can
do further research?
Would you consider this a viable approach with SQLite, or is it a wild
goose chase? Am I overlooking obvious drawbacks?
If you're on board, what would be a good size for the session files (n *
page_size?)?
Thank you for your input!
Christoph
