[HLUG] Aduna Autofocus Meta Search software

Steve Bushell steve at stevebushell.com
Tue Aug 28 20:01:38 BST 2007



-----Original Message-----
From: "Julian Robbins" <joolsr at fastmail.fm>
To: "Herefordshire Linux Users Group." <herefordshire at mailman.lug.org.uk>
Sent: 26/08/07 23:55
Subject: Re: [HLUG] Aduna Autofocus Meta Search software

Andrew Hodgson wrote:
> Hi,
> 
> On a standard server we would have several shares, and some of the files on the shares would only be accessible to specific people in the team.  With one of these tools going through the contents of the server, if it has access to the file, how do we stop the end user getting this file back as a search result?  True, they won't be able to see the file, but they could possibly view a preview of the file, or the phrases that matched the search.
> 
> Andrew.

Hi

Autofocus works on a per-share basis. You set up the server with
'Sources', i.e. file shares or a web source. You can set up as many as
you like, and choose how often you want these to be updated. The initial
indexing takes a long while, but still better than Google Desktop, I think.
Subsequent reindexing is done by checking the modified date and time,
so it will only re-index a file if it has changed, resulting in much
quicker indexing.
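
As a rough illustration of that modified-time check (this is not Autofocus's
own code, just a minimal Python sketch; the function name and the
last_indexed dictionary are made up for the example):

```python
import os

def files_needing_reindex(root, last_indexed):
    """Return paths under root whose modification time has changed.

    last_indexed maps path -> modification time seen on the previous run.
    It is a hypothetical bookkeeping structure, not Autofocus's format.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.path.getmtime(path)
            # New files have no recorded time, so they are picked up too.
            if last_indexed.get(path) != mtime:
                changed.append(path)
                last_indexed[path] = mtime
    return changed
```

The first run returns every file (the slow initial index); later runs only
return files that are new or have been touched since.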

Then you have a web admin interface for what is displayed to your users.
Out of these Sources, you create 'Profiles'. A profile is just a
filter for what searches can be made on, i.e. if you have 4
profiles set up, then your users can search on any of these profiles. I
tended to use one profile and one source per smb share. You can just map
a single source over to a profile, e.g. an smb share called quotes can be a
searchable profile called 'quotes'. Or you can add them all together.
At work I have set up 8 sources, some of which are mapped to individual
profiles, but one profile called 'All' contains links to all of the
sources. This one takes far longer to search through, though. (I need
more RAM.)
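
If it helps, the relationship can be pictured as two lookup tables. This is
only a sketch in Python to show the idea; the share names, the locations and
the sources_for helper are all invented, and Autofocus itself is configured
through its web GUI, not through code like this:

```python
# Hypothetical sources: a name mapped to where the content lives.
sources = {
    "quotes": r"\\server\quotes",
    "projects": r"\\server\projects",
    "website": "http://www.example.com/",
}

# Each profile is a named filter over one or more sources.
profiles = {
    "quotes": ["quotes"],        # one source mapped straight to a profile
    "projects": ["projects"],
    "All": list(sources),        # one profile linking to every source
}

def sources_for(profile_name):
    """Return the locations a search on the given profile would cover."""
    return [sources[name] for name in profiles[profile_name]]
```

Searching the 'All' profile covers every source, which is why it is the slow
one.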

Luckily, all the set-up is done through GUI web interfaces, so it's not as
complicated as it sounds. Even setting up Tomcat and installing
Autofocus is a doddle. It's just understanding how sources relate to
profiles that I found confusing to start with.

There are some detailed manuals that ship with the software. Autofocus runs
in Java on top of Apache Tomcat. The manuals are well written and pretty
useful. Don't forget that it can't (yet) work with databases (well, it
can, but as the guy at Aduna told me, the difficulty is getting
the data out in a decent searchable fashion). And if you want email,
you will have to install the desktop version (also OSS), which will
index IMAP email but still works fine with your server version to get
all the smb shares.

I found some problems where certain files won't index properly, but
that's hardly surprising with 175,000 files indexed. There are a few other
oddities, like having a '&' in the path of a file, which screws up the
results and gives an 'access denied' error (it's to do with web URL paths, I
think; I just renamed our (poorly named) directories where this
happened). But there's a neat form of transformation matrix should you
be using mapped drives. I didn't need that as we use UNC paths.
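
I don't know how Autofocus builds its links internally, so treat this as a
guess at the cause and not a diagnosis. If a file path ends up inside a web
URL, a bare '&' can be read as a parameter separator unless it is
percent-encoded. A small Python sketch of the encoding (the host name,
example path and path_to_url helper are made up):

```python
from urllib.parse import quote

def path_to_url(base_url, relative_path):
    """Build a link to a file, percent-encoding characters such as '&'."""
    # safe="/" keeps the path separators but encodes '&', spaces, etc.
    return base_url.rstrip("/") + "/" + quote(relative_path, safe="/")

print(path_to_url("http://searchbox/files", "R&D/quotes 2007.doc"))
# prints http://searchbox/files/R%26D/quotes%202007.doc
```

Renaming the directories, as I did, sidesteps the problem altogether.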

The only other thing to bear in mind is that since the web client runs
in any browser (fine on Linux, BTW), it needs the file-opening
associations set up correctly in your web browser, i.e. it may find files
in the results for you, but you may not have the relevant app to open them.

But on the whole, as the software has been written by a company, it's
well documented and well thought out for use at enterprise level.

Andrew, if you want a demo, and can come over to Leominster, I can show 
it to you running..

Julian

> 
> -----Original Message-----
> From: herefordshire-bounces at mailman.lug.org.uk [mailto:herefordshire-bounces at mailman.lug.org.uk] On Behalf Of Julian Robbins
> Sent: 26 August 2007 19:38
> To: Herefordshire Linux Users Group.
> Subject: Re: [HLUG] Aduna Autofocus Meta Search software
> 
> Andrew Hodgson wrote:
>> Julian,
>>
>> How does this access the files on your server, and how does it ensure that the relevant people can see the relevant files, and not for example files that they should not have access to ordinarily?
>>
>> Andrew.
> 
> Via smb shares, or those of a Windows server if you have one. Access is a bit
> more tricky; I guess the easiest way is to use a guest account with
> limited access, as I don't think it's configurable for individual user
> access.
> 
> http://www.aduna-software.com/products/autofocus_server/overview.view
> 
> Interestingly, there is also desktop software from Aduna that, as well
> as being able to access the server you have set up, allows much more
> visualisation of keywords and search terms and how they inter-relate, and
> usually IMAP email searching in amongst the results. The server software
> can also index and present search results from websites. Our
> www.q-par.com site is also a source for content on our server.
> 
> I was also impressed by Aduna's attitude to the software. It's all built on
> OSS software, and they only make money from services or customisations.
> They don't have a premium version that costs more ... You get the full
> version for free.
> 
> Julian
> 
>> -----Original Message-----
>> From: herefordshire-bounces at mailman.lug.org.uk [mailto:herefordshire-bounces at mailman.lug.org.uk] On Behalf Of Julian Robbins
>> Sent: 26 August 2007 10:23
>> To: Herefordshire Linux Users Group.
>> Subject: Re: [HLUG] Aduna Autofocus Meta Search software
>>
>> Julian Robbins wrote:
>>
>> Re the Autofocus software below.
>>
>> I'm still pleased with the operation, interface and usability of this
>> code now it's been more thoroughly tested at work.
>>
>> The only problem has been the speed, understandably. I think I need more
>> than the 2GB of RAM I have. Handling the 175,000 files it has indexed on our
>> server needs a lot of RAM, and it's maxing out at more than 90% usage.
>>
>> So at the moment, it's working well, apart from the time it takes to
>> search the index for the largest profile with these 175,000 files. With
>> only 10,000 files it's extremely fast.
>>
>> Mind you, a Google Mini search box for 300,000 files is £6900, which
>> includes two years' support and the hardware, so if you've server space
>> and lots of RAM, Autofocus is still well worth it.
>>
>> Not too bad to set up either, once you've got Apache and Tomcat set up.
>> The trick is not to use the Tomcat from the Ubuntu repos but to follow the
>> notes on the Ubuntu forum showing how to set up Tomcat 6. Then it's plain
>> sailing ....
>>
>> I'll tell more at a forthcoming LUG meeting, assuming we're still up for
>> our presentation to the Green Party and the FoE too in September ..
>>
>> What's going on with this, by the way? Can we get a list of members who'd
>> like to be involved? Please reply to this email if you're interested ..
>>
>> Julian
>>
>>
>>
>>
>>
>>> Hi
>>>
>>> A Real Linux posting now!!
>>>
>>> I have been intending to put together a web-based meta search
>>> facility for all our company information, files, databases and emails, to
>>> make a huge searchable repository of information, a bit like Google but
>>> just inside the company I work for.
>>>
>>> Although Beagle, Tracker, Strigi and Recoll,
>>>
>>> http://www.freesoftwaremagazine.com/blogs/desktop_search_beagle_part_1
>>>
>>> have many plaudits as desktop search tools that are actually useful
>>> every day, they lack a means to be easily rolled out across a
>>> company. Instead I wanted a single server that could serve searches via
>>> a standard web browser.
>>>
>>> After lots of looking I came across a PHP script for Beagle, called 
>>> 'Peagle'. This was ok, but wasn't being actively developed, and would 
>>> have needed work to get it right.
>>>
>>> After quite a bit more searching I stumbled upon something
>>> called Aduna Autofocus. They are a Dutch company offering exactly what I
>>> was looking for: www.aduna-software.com
>>>
>>> The software is all based on Open Source technologies (Spectacle, 
>>> Sesame), and runs on Java and Apache Tomcat. After a few irritations, I 
>>> got Tomcat installed, and the .war file slotted into the Tomcat 
>>> webserver fine.
>>>
>>> The Autofocus server is set up via a GUI, with decent manuals to help
>>> you. The web search interface is clean and really well thought out, as
>>> it has a couple of clever techniques to help you drill down through the
>>> data that is returned, via a list of suggestions and filters based on
>>> file type, author, date, size, type of source etc.
>>>
>>> I won't bore you with the details, but so far, I am extremely impressed
>>> with the software. Aduna too have their ideals very much in FOSS, and
>>> make money from support and developing custom solutions rather than
>>> offering a premium product at a high cost that is not under an Open
>>> Source licence.
>>>
>>> So, if you want to get a Google search appliance for your company but
>>> can't afford it, you may want to try this instead. It's not perfect, but it is
>>> mature and works well coupled with other Open Source technologies, should
>>> you want to develop it further.
>>>
>>> All I need is more RAM to test it with !!!
>>>
>>> Cheers
>>>
>>> Julian
>>>
>>> _______________________________________________
>>>

_______________________________________________
Herefordshire mailing list
Herefordshire at mailman.lug.org.uk
https://mailman.lug.org.uk/mailman/listinfo/herefordshire


