Thread: Search Engine

  1. #1
    Join Date
    Sep 2020
    Posts
    3

    Search Engine

    If we were to make a search engine using Java but at a smaller scale, how would we go about it?

    Assume a dataset is already provided and we are expected to load the content of the external file into data structures in Java, such as:

    a. Hashtable
    b. LinkedList
    c. ArrayList

    Once the data is read into the data structures, generic search algorithms are to be used to test the search capability on the dataset:

    a. Sequential Search for ArrayList and LinkedList.
    b. Built-in Binary Search for ArrayList and LinkedList.
    c. Built-in Hashtable search for the Hashtable.


    IDE: Eclipse IDE 2020‑06
    File source: https://drive.google.com/file/d/1EyG...ew?usp=sharing

  2. #2
    Join Date
    Feb 2017
    Posts
    677

    Re: Search Engine

    Originally posted by kchad2:
    how would we go about it
    Regarding b (built-in binary search): note that a binary search requires a random-access sequence with its items in sorted order. This means the ArrayList must be sorted first, and a LinkedList is a poor fit: Collections.binarySearch accepts one, but without random access it cannot deliver the O(log N) running time that is expected.
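
    A minimal sketch of that point, assuming the records are plain strings compared in their natural order. The class name, the method name and the sample titles are made up for illustration and are not taken from the posted data set:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

class BinarySearchSketch {

    // Sorts a copy of the input and binary-searches it; returns true if key is present.
    // Collections.binarySearch only gives meaningful results on a sorted list.
    static boolean containsSorted(List<String> records, String key) {
        List<String> sorted = new ArrayList<>(records); // ArrayList gives random access
        Collections.sort(sorted);
        return Collections.binarySearch(sorted, key) >= 0;
    }

    public static void main(String[] args) {
        // Hypothetical sample data
        List<String> titles = Arrays.asList("Dune", "Emma", "Ulysses", "Beloved");
        System.out.println(containsSorted(titles, "Emma"));
        System.out.println(containsSorted(titles, "Hamlet"));

        // A sorted LinkedList is accepted too, but every probe has to walk the links,
        // so the search is no longer O(log N).
        LinkedList<String> linked = new LinkedList<>(titles);
        Collections.sort(linked);
        System.out.println(Collections.binarySearch(linked, "Dune") >= 0);
    }
}
```

    Sorting on every call is only done to keep the sketch short. In practice you would sort once after loading and then search many times.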

    If the number of records in the data set you posted is typical (around 10,000 records), it is actually quite small by today's standards, which simplifies things. You can use an ArrayList as the database, supported by two main functions: extract and sort.

    You load the data set row by row as objects (of a class called, for example, Record) into an ArrayList (the database). The extract function is called with a selection criterion: it walks sequentially through an input ArrayList (typically, but not necessarily, the database), selects the Record objects that fit the criterion, and returns them in an output ArrayList. The sort function is called with a sort criterion and sorts the Records of an ArrayList in place according to it.
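
    The two functions could be sketched roughly as follows. The Record fields (title, author, year) are assumptions, since the real ones depend on the columns of the data set, and expressing the criteria as a Predicate and a Comparator is just one possible choice:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

class FlatDatabaseSketch {

    // Hypothetical record type; the real fields depend on the data set.
    static class Record {
        final String title;
        final String author;
        final int year;

        Record(String title, String author, int year) {
            this.title = title;
            this.author = author;
            this.year = year;
        }
    }

    // Walks the input sequentially and returns the Records that match the criterion.
    static List<Record> extract(List<Record> input, Predicate<Record> criterion) {
        List<Record> output = new ArrayList<>();
        for (Record r : input) {
            if (criterion.test(r)) {
                output.add(r);
            }
        }
        return output;
    }

    // Sorts the given list in place according to the sort criterion.
    static void sort(List<Record> records, Comparator<Record> criterion) {
        records.sort(criterion);
    }

    public static void main(String[] args) {
        List<Record> database = new ArrayList<>();
        database.add(new Record("Persuasion", "Austen", 1817));
        database.add(new Record("Dune", "Herbert", 1965));
        database.add(new Record("Emma", "Austen", 1815));

        // Query: all records by one author, oldest first
        List<Record> byAusten = extract(database, r -> r.author.equals("Austen"));
        sort(byAusten, Comparator.comparingInt(r -> r.year));
        for (Record r : byAusten) {
            System.out.println(r.title + " (" + r.year + ")");
        }
    }
}
```

    Because extract returns a new ArrayList, the result of one query can be fed into another extract or sort without touching the database itself.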

    To update a Record in the ArrayList database, the Record can be modified in place or replaced by a new Record at the same index position. New Records are added at the end of the ArrayList. To remove a Record, overwrite it with the last Record of the ArrayList and then drop the last element; this avoids shifting the Records that follow, at the cost of not preserving their order.
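
    As a rough illustration of these three operations, here is a sketch that uses strings in place of Records. The helper name removeAt is made up:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SwapRemoveSketch {

    // Removes the element at index without shifting the elements after it:
    // the last element is moved into the freed slot, then the tail is dropped.
    // The order of the remaining elements is not preserved.
    static <T> void removeAt(List<T> list, int index) {
        int last = list.size() - 1;
        list.set(index, list.get(last)); // overwrite with the last element
        list.remove(last);               // remove(int) drops the duplicated tail
    }

    public static void main(String[] args) {
        List<String> database = new ArrayList<>(Arrays.asList("A", "B", "C", "D"));
        database.set(1, "B2");   // update: replace at the same index position
        database.add("E");       // add: append at the end
        removeAt(database, 0);   // remove: "E" takes the place of "A"
        System.out.println(database); // prints [E, B2, C, D]
    }
}
```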

    This is sufficient for simple queries and basic data presentations. If you need to do something more complex, you may have to add secondary indexes, but this can be done gradually as the need arises.
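
    One possible shape for such a secondary index, purely as a sketch: a HashMap from a field value to the positions of the matching rows. The row layout {title, author} and the method name are assumptions:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SecondaryIndexSketch {

    // Builds a map from author name to the positions of that author's rows.
    // Each row is assumed to be {title, author}; this layout is hypothetical.
    static Map<String, List<Integer>> indexByAuthor(List<String[]> rows) {
        Map<String, List<Integer>> index = new HashMap<>();
        for (int i = 0; i < rows.size(); i++) {
            index.computeIfAbsent(rows.get(i)[1], k -> new ArrayList<>()).add(i);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"Emma", "Austen"});
        rows.add(new String[] {"Dune", "Herbert"});
        rows.add(new String[] {"Persuasion", "Austen"});

        Map<String, List<Integer>> byAuthor = indexByAuthor(rows);
        System.out.println(byAuthor.get("Austen")); // positions 0 and 2
    }
}
```

    The index stores positions, so it has to be rebuilt or updated whenever Records are moved or removed in the database.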

    The approach I suggest is very similar to a so-called flat file database, only the flat file is replaced by an ArrayList in memory. The ArrayList has no specific internal organization; it's just an arbitrary sequence of Records.
    Last edited by wolle; September 11th, 2020 at 12:27 AM.

  3. #3
    Join Date
    Sep 2020
    Posts
    3

    Re: Search Engine

    I have done the code this far. What I don't understand is how best I could do the following:

    1. Separate the main from the other classes
    2. Add data from the .csv file into a Hashtable
    3. Implement the search algorithms as mentioned in #1

    Code:
    import java.io.BufferedReader;
    import java.io.FileNotFoundException;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.LinkedList;
    
    public class CSVReader {
    	
    	public static void main(String[] args) {
    		
    		String path = "C:\\books.csv";
    		String line;
    		ArrayList<String> aList = new ArrayList<>();
    		LinkedList<String> lList = new LinkedList<>();
    //		Hashtable<String, String> hList = new Hashtable<>();	// the class is java.util.Hashtable, not HashTable
    		
    		// try-with-resources closes the reader even if reading fails
    		try (BufferedReader br = new BufferedReader(new FileReader(path))) {
    			// Store every line of the file, unparsed, in both lists
    			while ((line = br.readLine()) != null) {
    				aList.add(line);
    				lList.add(line);
    //				hList.put(key, line);	// Hashtable has put(key, value), not add; a key must be chosen first
    			}
    		} catch (FileNotFoundException e) {
    			e.printStackTrace();
    		} catch (IOException e) {
    			e.printStackTrace();
    		}
    	}
    
    }

  4. #4
    Join Date
    Feb 2017
    Posts
    677

    Re: Search Engine

    Originally posted by kchad2:
    I have done the code this far. What I don't understand is how best I could do the following:
    So your main problem is that you lack basic Java programming skills right, and now want someone to do the job for you?

    Still my advice stands - keep it simple! Base your database on an ArrayList supported by sequential extraction and sorting.

    Good Luck.
    Last edited by wolle; September 11th, 2020 at 12:39 AM.

  5. #5
    Join Date
    Sep 2020
    Posts
    3

    Re: Search Engine

    Originally posted by wolle:
    So your main problem is that you lack basic Java programming skills right, and now want someone to do the job for you?
    Not really. I just need someone to guide me with my code. I have already started, and the code I posted is free of errors and warnings. But separating the classes, so that one class reads the file and another class puts its content into a specific data structure, is a bit challenging for me. I have a C++ background, where we studied late objects, and that's my biggest weakness.

    If the classes were straightforward, I could have done it myself. As for Hashtables, I know very little about them; if it were HashMaps, I could have done it, since I have worked with Maps.

  6. #6
    Join Date
    Feb 2017
    Posts
    677

    Re: Search Engine

    Originally posted by kchad2:
    I have a C++ background
    I wouldn't say Java is totally different from C++, but there are substantial differences. Java has a much stronger object-oriented (OO) feel to it: there's a garbage collector, all objects are allocated on the heap and handled through references (pointers in all but name), all classes inherit from a common superclass called Object, etcetera. Coding Java like non-OO C++ will result in a bad Java program. I suggest you work through a couple of Java tutorials to get the hang of it:

    https://docs.oracle.com/javase/tutorial/

    You could use a tutorial example as a template for your own code, especially the program entry point main(). This is also where you set up a GUI if you want one.
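
    To give an idea of what keeping main() apart from the rest can look like, here is a small sketch in which one class only reads lines and the entry point only wires things together. The class names LineLoader and SearchApp are made up, and a StringReader stands in for the real file so the sketch runs anywhere:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Entry point: only wires the pieces together.
class SearchApp {
    public static void main(String[] args) {
        // With a real file this would be: new FileReader("C:\\books.csv")
        Reader source = new StringReader("Emma,Austen\nDune,Herbert\n");
        List<String> lines = LineLoader.load(source);
        System.out.println(lines.size() + " lines loaded");
    }
}

// Hypothetical helper: knows how to read lines, nothing about searching.
class LineLoader {
    static List<String> load(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Rethrown unchecked to keep the caller short; a design choice for this sketch
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

    The returned list can then be handed to whichever class fills the ArrayList, LinkedList or Hashtable, so the file handling lives in one place.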

    Java has a direct counterpart to most containers in the C++ standard library. ArrayList corresponds to vector, LinkedList to list, TreeMap to map, HashMap and Hashtable (a legacy class, largely superseded by HashMap) to unordered_map, TreeSet to set, HashSet to unordered_set, etcetera. But the correspondence is mostly at the data structure level; usage differs considerably. Again, I suggest you work through a tutorial and study example code.
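
    Since Hashtable implements the same Map interface as HashMap, loading it looks much like working with any other Map. A minimal sketch, assuming each CSV line is keyed by its first column (that choice, the class name and the sample lines are made up for illustration):

```java
import java.util.Arrays;
import java.util.Hashtable;
import java.util.Map;

class HashtableSketch {

    // Keys each CSV line by its first column; which column is the key is an assumption.
    static Map<String, String> indexByFirstColumn(Iterable<String> lines) {
        Map<String, String> table = new Hashtable<>();
        for (String line : lines) {
            String key = line.split(",", 2)[0];
            table.put(key, line); // a later line with the same key replaces the earlier one
        }
        return table;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines
        Map<String, String> table = indexByFirstColumn(
                Arrays.asList("1,Emma,Austen", "2,Dune,Herbert"));
        System.out.println(table.get("2"));         // built-in hash lookup by key
        System.out.println(table.containsKey("3")); // no line has this key
    }
}
```

    Splitting on a plain comma is a simplification: it does not handle quoted fields that contain commas.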

    Please note the comment I made about binary searches at the top of post #2 above.

    Finally, I think you should seriously consider the flat-file-inspired database I suggested in #2. Simplicity is king!
    Last edited by wolle; September 15th, 2020 at 06:31 AM.
