Dictionary Design

September 2nd, 2004, 03:45 AM
galatasarayer

Dictionary Design

I am trying to make a Turkish-Spanish dictionary. I can't decide how to store the data in files. Can you give me any advice on how to lay out the files? I have already decided how to do lookups in memory. Thanks in advance.
September 2nd, 2004, 05:14 AM
Yves M

Re: Dictionary Design

You could dump your memory to disk if it's in a format that can be written as is (e.g. an FSM); you could use a database (so that you don't have to read everything into memory at run time); or you could use a simple text file (which will take longer to read, but the user can then edit it).
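
To make two of these options concrete, here is a minimal sketch in Python (chosen only for illustration; the thread does not name a language). It contrasts a hand-editable tab-separated text file with a binary dump of the in-memory structure made with the standard `pickle` module. The sample entries, function names, and file names are all hypothetical.

```python
import os
import pickle
import tempfile

# Hypothetical sample entries (Turkish -> Spanish); not taken from the thread.
ENTRIES = {"ev": "casa", "su": "agua", "kitap": "libro"}


def save_text(entries, path):
    """Write one 'word<TAB>translation' pair per line; a user can edit this by hand."""
    with open(path, "w", encoding="utf-8") as f:
        for word in sorted(entries):
            f.write(f"{word}\t{entries[word]}\n")


def load_text(path):
    """Parse the tab-separated file back into a dict (every line must be parsed)."""
    entries = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # tolerate blank lines left by hand editing
            word, translation = line.split("\t", 1)
            entries[word] = translation
    return entries


def save_dump(entries, path):
    """Dump the in-memory structure as is; compact, but not human-editable."""
    with open(path, "wb") as f:
        pickle.dump(entries, f)


def load_dump(path):
    """Read the dumped structure back without any parsing code of our own."""
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    folder = tempfile.mkdtemp()
    text_path = os.path.join(folder, "dictionary.txt")
    dump_path = os.path.join(folder, "dictionary.bin")
    save_text(ENTRIES, text_path)
    save_dump(ENTRIES, dump_path)
    print(load_text(text_path) == load_dump(dump_path))
```

The text format costs a parse on every start-up but stays editable; the dump is tied to the program that wrote it and should only be loaded from files you created yourself.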
September 2nd, 2004, 06:53 AM
mehdi62b

Re: Dictionary Design

IMO, working with databases is much easier, but I have a question: can you tell me what you mean by a database here?
I mean, how can I write software that installs our word database on every client computer whenever the software itself is installed?
(E.g. in ASP.NET we have only one database, which sits on the server, and every request is handled by the server's DBMS, but here we would have to install the database on every client computer.
Can someone clarify this for me? Am I completely wrong?)
Thanks in advance.

--------------------
Mehdi.:)
September 2nd, 2004, 07:59 AM
Yves M

Re: Dictionary Design

On Windows, you can always use Jet databases through DAO or ADO. This should be installed on any recent Windows computer.
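
The general idea of a file-based database that ships with the application (no server involved) can be sketched as follows. Note that this is not Jet, DAO, or ADO: Python has no built-in binding for those, so the sketch swaps in SQLite through the standard `sqlite3` module purely to illustrate the same kind of embedded, single-file database. The table name, column names, and sample entries are assumptions.

```python
import sqlite3


def create_dictionary_db(path, entries):
    """Create (or open) a single-file database and store word/translation pairs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries ("
        "word TEXT PRIMARY KEY, translation TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO entries (word, translation) VALUES (?, ?)",
        entries.items(),
    )
    conn.commit()
    return conn


def lookup(conn, word):
    """Fetch one translation; only the matching row is read, not the whole file."""
    row = conn.execute(
        "SELECT translation FROM entries WHERE word = ?", (word,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    # ":memory:" keeps the demo self-contained; a real application would pass
    # the path of a database file installed next to the program.
    conn = create_dictionary_db(":memory:", {"ev": "casa", "su": "agua"})
    print(lookup(conn, "ev"))
    conn.close()
```

With this kind of database the installer only has to copy one data file to the client machine; there is no separate DBMS to set up.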
September 2nd, 2004, 08:53 AM
Joe Nellis

Re: Dictionary Design

I have found that word dictionaries are best handled by "directed acyclic word graphs" (DAWGs). They provide very fast lookup and can be compressed enough that you can keep both the Spanish and the Turkish word lists in memory without going to disk. If you intend to include definitions, usage notes, and the like, though, it may be easier (but not faster) to use a database.
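
As a rough illustration of where the compression comes from, the sketch below builds an ordinary trie and then merges identical subtrees bottom-up, which turns it into a DAWG. It is a simple, non-incremental construction written for this example, not a specific published algorithm, and it only answers "is this word in the list?"; attaching translations needs extra machinery. All names and sample words are hypothetical.

```python
class Node:
    """One state of the graph: outgoing letters plus an end-of-word flag."""
    __slots__ = ("children", "final")

    def __init__(self):
        self.children = {}
        self.final = False


def build_trie(words):
    """Insert every word letter by letter; shared prefixes share nodes."""
    root = Node()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, Node())
        node.final = True
    return root


def minimize(node, registry):
    """Merge equivalent subtrees bottom-up so shared suffixes share nodes too."""
    for ch, child in list(node.children.items()):
        node.children[ch] = minimize(child, registry)
    # Two nodes are equivalent if they agree on the end-of-word flag and lead
    # to the same canonical children under the same letters.
    key = (node.final,
           tuple(sorted((ch, id(c)) for ch, c in node.children.items())))
    return registry.setdefault(key, node)


def contains(root, word):
    """Follow one edge per letter; the cost depends only on the word length."""
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:
            return False
    return node.final


def count_nodes(root):
    """Count distinct reachable nodes (shared nodes are counted once)."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.children.values())
    return len(seen)


if __name__ == "__main__":
    words = ["tap", "taps", "top", "tops"]
    trie = build_trie(words)
    before = count_nodes(trie)
    dawg = minimize(trie, {})
    print(before, count_nodes(dawg), contains(dawg, "tops"))
```

Lookup is unchanged by the merge; only the number of nodes held in memory shrinks, and the saving grows with the number of words that share endings.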
September 2nd, 2004, 09:09 AM
mehdi62b

Re: Dictionary Design

Yes, you are right.
But suppose we want to use MS SQL Server here (a non-Jet database).
Can you tell me how that would be possible?
Thanks

------------------
Mehdi.:)
September 3rd, 2004, 03:44 PM
mehdi62b

Re: Dictionary Design

Quote:

Originally Posted by Yves M

You could dump your memory to disk if it's in a format that can be written as is (e.g. an FSM); you could use a database (so that you don't have to read everything into memory at run time); or you could use a simple text file (which will take longer to read, but the user can then edit it).

Hello Yves,
Could you tell me what you mean by dumping memory? Do you mean, for example, that I can use Stream and StreamReader objects to buffer my database file while reading it, or did you mean something else?
Could you help me or point me to a link or a tip?
Thank you very much in advance.

---------------
Mehdi.:)
September 5th, 2004, 07:07 PM
RoboTact

Re: Dictionary Design

I think the best approach would still be a hand-written engine, or some simple dictionary-specific engine. Any dictionary is a simple structure, and if you write the engine by hand (and properly!) you get better speed and no need to support (or purchase) external systems such as databases or other general-purpose technology.

Indexing can be done with two or three structures at once: a binary alphabetic tree for positioning in the dictionary as the first characters are entered, and a hash for quick search and inexact search. Of course, all the data would stay on disk without being loaded into memory; the indexes only store what is needed to fetch that data from the file. The file structure depends on whether you need to change a large number of words at run time: if so, you should provide some safe saving method (clustering the file would be OK). The hash and the binary tree can be updated in no time.
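
A minimal sketch of that layout, under several assumptions: the entries live in a tab-separated data file, a hash (a Python `dict`) maps each word to its byte offset for exact lookup, and a sorted word list searched with `bisect` stands in for the binary alphabetic tree. Inexact search and safe saving are not covered, the ordering is plain code-point order rather than Turkish collation, and all names and sample entries are hypothetical.

```python
import bisect
import os
import tempfile


def write_data_file(path, entries):
    """Write 'word<TAB>translation' lines and return {word: byte offset}."""
    index = {}
    with open(path, "wb") as f:
        for word, translation in entries:
            index[word] = f.tell()  # where this entry starts in the file
            f.write(f"{word}\t{translation}\n".encode("utf-8"))
    return index


def lookup(path, index, word):
    """Exact search: one hash probe, one seek, one line read from disk."""
    offset = index.get(word)
    if offset is None:
        return None
    with open(path, "rb") as f:
        f.seek(offset)
        line = f.readline().decode("utf-8").rstrip("\n")
    return line.split("\t", 1)[1]


def words_with_prefix(sorted_words, prefix):
    """Binary-search to the first candidate, then scan while the prefix matches."""
    start = bisect.bisect_left(sorted_words, prefix)
    matches = []
    for word in sorted_words[start:]:
        if not word.startswith(prefix):
            break
        matches.append(word)
    return matches


if __name__ == "__main__":
    # Hypothetical sample entries (Turkish -> Spanish).
    entries = [("kalem", "lápiz"), ("kapı", "puerta"),
               ("kitap", "libro"), ("su", "agua")]
    path = os.path.join(tempfile.mkdtemp(), "dictionary.dat")
    index = write_data_file(path, entries)
    sorted_words = sorted(index)
    print(lookup(path, index, "kapı"), words_with_prefix(sorted_words, "ka"))
```

Only the two indexes stay in memory; the translations are read from the file on demand, which is the division of labour described above.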
September 8th, 2004, 04:42 AM
mehdi62b

Re: Dictionary Design

Thank you very much for your reply.
I found a good explanation in this book:
Fundamentals of Data Structures by Ellis Horowitz, chapter 8: Hashing

-------------------
Mehdi.:)
January 3rd, 2005, 04:46 PM
Erebus

Re: Dictionary Design

Quote:

Originally Posted by galatasarayer

I am trying to make a Turkish-Spanish dictionary. I can't decide how to store the data in files. Can you give me any advice on how to lay out the files? I have already decided how to do lookups in memory. Thanks in advance.

An FSM is a good idea for storing dictionary entries (and, in your case, the translation of each entry as its definition). If you choose this way, I advise you to use the s_fsa toolkit developed by Jan Daciuk. You can build a minimal automaton. You can also deal with diacritic errors ...

See http://juggernaut.eti.pg.gda.pl/~jandac/ for more information.