[HLUG] Open Source Databases

Dr Chris Owens cowens at clara.co.uk
Fri Nov 16 16:07:14 UTC 2012


Hi Everybody

First, thank you Jools, Kevin and Mark - your input has been a great help.

Perhaps to take it a bit further, let me give a bit more detail. The
current project includes the resurrection of methods and tables from the
1990s, which I could do with the existing old SuperBase. (I am amazed
they are back in business, but glad - it was a very useful, stable tool.
The trouble is that the current owners don't plan to upgrade the
Developer edition to 64-bit, licence fees are now payable to a third
party, and the product seems tied partly to Windows development. I could
be wrong, but I think I should give it a miss - sad.)

The plan is to redevelop the method and combine the best characteristics
of the old databases, which were essentially flat-file, and, in view of
the complexity of the research field and its changing nature, move to a
relational model so as to maximise flexibility and interrogation
capabilities (the sort of cross-table question sketched just below). My
guess therefore is that the NoSQL family is largely out - a pity,
because e.g. Riak looks to be good and well supported.
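
To show the sort of interrogation I mean, here is a hypothetical
cross-table query; the table and column names are made up, and I am
using Python's built-in sqlite3 purely as a self-contained stand-in for
whatever engine is finally chosen:

    import sqlite3

    # In-memory database purely for illustration.
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE patients (id INTEGER PRIMARY KEY, surname TEXT);
        CREATE TABLE visits (id INTEGER PRIMARY KEY,
                             patient_id INTEGER REFERENCES patients(id),
                             visit_date TEXT, outcome TEXT);
        INSERT INTO patients VALUES (1, 'Smith');
        INSERT INTO patients VALUES (2, 'Jones');
        INSERT INTO visits VALUES (1, 1, '1996-03-01', 'discharged');
        INSERT INTO visits VALUES (2, 2, '1997-07-12', 'referred');
    """)

    # The kind of cross-table question that is painful with flat files:
    query = """SELECT p.surname, v.visit_date, v.outcome
               FROM patients p JOIN visits v ON v.patient_id = p.id
               ORDER BY v.visit_date"""
    for surname, date, outcome in con.execute(query):
        print(surname, date, outcome)

With flat files, the same question means hand-written lookup code for
every new combination of tables.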

As for volume, there are usually anything up to 30 tables, some with up
to 20 fields (columns), and at present about 100,000 rows; many more
rows are envisaged in the future.

I think your advice suggests a three-stage operation.

Stage 1: Import the data (which sadly comes in all forms and formats) as
flat files into a reasonably powerful data-manipulation program (any
suggestions for what this kind of program is called?) that can fracture,
validate and re-format the contents into the table structure required by
the main engine, then export the results as .csv files. This would give
the cell-by-cell examination I was speaking of - something along the
lines of the sketch below.
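
By way of illustration only, here is a minimal sketch of the kind of
Stage 1 cleaning pass I have in mind, in Python; the file names, column
layout and validation rule are all invented for the example:

    import csv

    # Hypothetical input: a raw export with inconsistent formatting.
    # Every cell is visited individually - the cell-by-cell examination
    # I am after - so it can be trimmed, validated and reshaped.
    with open("raw_export.csv", newline="") as src, \
         open("patients_clean.csv", "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(["patient_id", "surname", "year_of_birth"])
        for row in reader:
            cells = [cell.strip() for cell in row]         # tidy each cell
            if len(cells) != 3 or not cells[0].isdigit():  # crude validation
                print("rejected:", cells)                  # or log to a reject file
                continue
            writer.writerow(cells)

A dedicated tool would of course do far more, but even a short script
like this gives complete control over every cell.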
Stage 2: Import into the main RDBMS (say MySQL, in view of its user
base, forum support, likely longevity and open-source licence).

Stage 3: Export query output, again as .csv, into a data visualisation
or specialised reporting package.
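
Again purely as a sketch, and assuming MySQL ends up as the engine: the
connection details and table name below are invented, and it uses the
third-party mysql-connector-python package.

    import csv
    import mysql.connector  # third-party: mysql-connector-python

    con = mysql.connector.connect(user="chris", password="secret",
                                  database="healthcare")
    cur = con.cursor()

    # Stage 2: load the cleaned rows produced in Stage 1.
    with open("patients_clean.csv", newline="") as f:
        rows = list(csv.reader(f))
    cur.executemany("INSERT INTO patients (patient_id, surname, year_of_birth) "
                    "VALUES (%s, %s, %s)", rows[1:])  # rows[1:] skips the header
    con.commit()

    # Stage 3: export a query result as .csv for the reporting package.
    cur.execute("SELECT surname, year_of_birth FROM patients ORDER BY surname")
    with open("report.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([col[0] for col in cur.description])  # header row
        writer.writerows(cur.fetchall())

MySQL's own LOAD DATA INFILE and SELECT ... INTO OUTFILE would do the
same two jobs without any Python at all; the script form just keeps
everything in one place.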

LibreOffice Base can do some of this through Calc, but I sense it
probably isn't as good as the above would be with a dedicated system for
each stage. Kexi also looks very interesting, and there is good support
for scripting, but I guess it too is a bit young for this kind of work.

Am I thinking along the right lines or have I now lost it completely?

Many thanks

Chris

On 14/11/12 20:33, Mark Broadbent wrote:
> On Tuesday, 13 November 2012, Dr Chris Owens wrote:
>
>> Hi All,
>>
>> Please can I pick the brains of the group for a bit of advice.
>>
>> I have been asked to re-start work from 15-20 years ago on healthcare
>> content and logistics. The trouble is that it was all done with SuperBase,
>> which has long since gone bust and won't in any case run on modern machines.
>>
>> The key ingredient SB had was an integrated provision to write scripts
>> easily (in a form of BASIC) that bypassed the GUI services (especially for
>> Forms and Output) and gave easy and very tight control of cursor movement
>> to rows (and, after a bit of fiddling, to columns), quite apart from
>> customising Forms and Output. This meant the content of every cell was
>> easily accessed, which is my first requirement. (I could be wrong, but
>> using SQL in e.g. MySQL would be a hard way of managing this sort of
>> access, with or without content manipulation.)
>>
>> Two other key requirements are a strong import and export facility that
>> can handle as many (especially proprietary) formats as possible with good
>> filtering capability, and a good capacity for volume. Speed is not
>> critical, and finally I would like it to be open source.
>>
>> So I suppose what is really needed is a dogged, stable facility without too
>> many bells and whistles that is reasonably easy to use and can act as a
>> processing engine - any sophisticated data visualisation capability would
>> just be a bonus.
> Hi Chris,
>
> You're right that using SQL might not be ideal for this type of data;
> however, there is now a huge range of databases that offer different ways of
> searching and mapping data.  I use Riak, which allows data to be stored in
> JSON format that can be indexed and searched using JavaScript (this being
> an example of a NoSQL database, where there are no strict schemas in place by
> default).  However, there are plenty of other choices depending on how you
> need to read and write the data.
>
> For your import/export requirement, I'm not sure you're going to find many
> databases that can import many formats; in my experience this is generally
> left to an external program that maps the input into the schema of your
> database.  There is plenty of choice out there in the open source world
> for DBs; the hard bit is choosing the right one, particularly as you
> mention volume of data to be stored as a requirement.  What sort of volumes
> are we talking?  I'm guessing that if the project is 10-15 years old the
> volume is not in the terabytes?
>
> Thanks
> Mark


