The core of the core of the big data solutions -- Map

Thread: The core of the core of the big data solutions -- Map

  1. #1
    Join Date
    Jun 2005
    Posts
    3

    The core of the core of the big data solutions -- Map

    Title: The core of the core of the big data solutions -- Map
    Author: pengwenwei
    Email: pww71@sina.com
    Language: c++
    Platform: Windows, linux
    Technology: Perfect hash algorithm
    Level: Advanced
    Description: Map algorithm with high performance
    Section MFC c++ map stl
    SubSection c++ algorithm
    License: (GPLv3)

    Download demo project - 1070 Kb
    Download source - 1070 Kb

    Introduction:
    In C++ programs, map is used everywhere, and the performance bottleneck of a program is often the performance of its map. This is especially true with large data sets where the business logic is so tightly coupled that the data cannot be distributed or processed in parallel. In those cases the performance of map becomes the key technology.

    In my work in the telecommunications and information security industries, I dealt with large volumes of low-level data, especially information security data, which is the most complex. None of it can be handled without map.

    Examples include IP tables, MAC tables, telephone number lists, domain name resolution tables, ID number lookups, and Trojan and virus signature lookups for cloud-based antivirus.

    The map in the STL library uses binary search, and it has the worst performance. Google's hash map currently has the best performance and memory usage, but it has some probability of collisions. Big data applications today rarely use a map with a collision probability; anything involving fees, in particular, cannot be wrong.

    I am now publishing my algorithms here. There are three kinds of map, and the one available after the build step is a hash map. You can run the comparison tests: my algorithm has zero probability of collision, yet it still performs better than the hash algorithm, and even its ordinary performance is not much different from Google's.
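
    A comparison like the one suggested above could be set up along these lines. This is only a minimal sketch: the interface of pwwHashMap is not shown in this thread, so the example times the standard containers, and the helper names (time_lookups, make_table, LookupResult) are hypothetical. Any container that exposes find() and end() could be passed in the same way.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <unordered_map>
#include <vector>

// Hypothetical result type: the checksum keeps the compiler from
// discarding the lookup loop, and lets two containers be cross-checked.
struct LookupResult {
    std::uint64_t checksum;
    double seconds;
};

// Time a batch of lookups against any map-like container with find()/end().
template <typename Map>
LookupResult time_lookups(const Map& m, const std::vector<std::uint32_t>& queries) {
    auto start = std::chrono::steady_clock::now();
    std::uint64_t sum = 0;
    for (std::uint32_t q : queries) {
        auto it = m.find(q);
        if (it != m.end()) sum += it->second;
    }
    std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return {sum, elapsed.count()};
}

// Fill a container with n distinct, scattered 32-bit keys.
// Multiplying by an odd constant modulo 2^32 never maps two inputs to one key.
template <typename Map>
Map make_table(std::uint32_t n) {
    Map m;
    for (std::uint32_t i = 0; i < n; ++i) {
        m.emplace(i * 2654435761u, i);
    }
    return m;
}
```

    Comparing the checksum from std::map with the one from std::unordered_map confirms that both containers answered the same queries before the timings are compared. Timings depend on the machine, compiler flags, and key distribution, so no numbers are claimed here.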

    My algorithm is a perfect hash algorithm. Its key index and its compression principle are out of the ordinary; most importantly, it uses a completely different structure, so the key index compression is fundamentally different. The most direct benefit for a program is that a solution that needed ten servers with the original map now needs only one server.
    Declaration: the code may not be used for commercial purposes. For commercial applications, contact me on QQ 75293192.
    Download:
    https://sourceforge.net/projects/pww...h.zip/download

    Applications:
    First, modern warfare cannot do without querying massive amounts of information. If a query for enemy target information is slowed by one second, it could delay a fighter and lead to the failure of the entire war. Information retrieval is inseparable from map; if military products use pwwHashMap instead of a traditional map, you will be the winner.

    Second, router performance determines browsing speed. Just replace the map in open source router code with pwwHashMap and its speed can increase tenfold.
    A router's DHCP protocol has many tables to query and update, such as IP and MAC tables, and all of this is done with map. Until now all of these maps have used the STL library, whose performance is very low, while a hash map has a probability of error, so the only option has been to spread packets across multiple routers. With pwwHashMap you can save at least ten sets of equipment.

    Third, Hadoop is currently recognized as the big data solution, and its most fundamental feature is extremely heavy use of map instead of SQL and tables. Hadoop assumes the data is so large that it cannot be moved at all, so analysis must be carried out locally, where the data resides. But as long as the map in the open source Hadoop code is changed to pwwHashMap, performance will increase a hundredfold without any problems.


    Background for this article that may be useful, such as an introduction to the basic ideas presented:
    http://blog.csdn.net/chixinmuzi/article/details/1727195

  2. #2
    Join Date
    Jul 2013
    Posts
    541

    Re: The core of the core of the big data solutions -- Map

    Quote Originally Posted by pww71
    My algorithm is a perfect hash algorithm. Its key index and its compression principle are out of the ordinary; most importantly, it uses a completely different structure, so the key index compression is fundamentally different.
    Perfect hashing comes at a cost and it will show up somewhere eventually. There is no free lunch.

    But by all means, if you have revolutionized hashing then I suggest you write a scientific article. If you can achieve that "the performance will increase hundredfold without any problems" I'm sure it won't be hard to get it published.
    Last edited by razzle; March 25th, 2015 at 01:43 AM.

  3. #3
    Join Date
    Apr 2000
    Location
    Belgium (Europe)
    Posts
    4,362

    Re: The core of the core of the big data solutions -- Map

    "Big data" does not compute with "hash map" (hash functions, maybe, but doubtfully with map keys).

    You also make weird leaps with modern warfare, inferring that target acquisition has anything to do with big data, and/or that fighters have live connections to outside databases to make decisions. Ugh?



    In hash maps, the goal is to make the hash function as good as possible: fast to compute and allowing fast access, so you typically have short hashes (32-bit, 64-bit). You WILL get collisions, so you need a collision detection and resolution strategy (this is entirely unrelated to the hash algorithm). There are many approaches to collision resolution. None is ideal; they all have pros and cons, and finding the one that works best for you is part of the developer's job.
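
    As a minimal sketch of one such strategy (separate chaining), assuming string keys and int values: ChainedMap is a hypothetical class written for this illustration, not code from the project discussed in this thread. Detection is the full key comparison inside a bucket; resolution is appending to that bucket's list.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration class: a fixed number of buckets, each a list.
class ChainedMap {
public:
    // bucket_count must be at least 1.
    explicit ChainedMap(std::size_t bucket_count) : buckets_(bucket_count) {
        assert(bucket_count > 0);
    }

    // Insert a new key or overwrite the value of an existing one.
    void put(const std::string& key, int value) {
        auto& bucket = buckets_[index_of(key)];
        for (auto& entry : bucket) {
            if (entry.first == key) {  // detection: compare the full key
                entry.second = value;
                return;
            }
        }
        bucket.emplace_back(key, value);  // resolution: extend the chain
    }

    // Returns true and fills `out` if the key is present.
    bool get(const std::string& key, int& out) const {
        const auto& bucket = buckets_[index_of(key)];
        for (const auto& entry : bucket) {
            if (entry.first == key) {
                out = entry.second;
                return true;
            }
        }
        return false;
    }

private:
    std::size_t index_of(const std::string& key) const {
        return std::hash<std::string>{}(key) % buckets_.size();
    }

    std::vector<std::list<std::pair<std::string, int>>> buckets_;
};
```

    Constructing ChainedMap with a single bucket forces every key to collide, which makes it easy to see that lookups stay correct and only get slower as chains grow. Open addressing (probing for another slot) is the usual alternative, with a different set of trade-offs.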

    Perfect hashes are only possible with static data. On dynamic data, it's impossible to make a "perfect" hash; you'll always have collisions.
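
    To illustrate the static case, here is a minimal sketch that searches for a seed under which a fixed set of keys has no collisions. It is a brute-force illustration, not the algorithm from the first post, and the names (seeded_hash, StaticPerfectTable) are hypothetical. The hash is an FNV-1a style loop with the seed mixed into the starting state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// FNV-1a style byte hash; the seed perturbs the initial state.
std::uint64_t seeded_hash(const std::string& key, std::uint64_t seed) {
    std::uint64_t h = 14695981039346656037ULL ^ seed;
    for (unsigned char c : key) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Hypothetical illustration type: one slot per key, no chains, no probing.
struct StaticPerfectTable {
    std::uint64_t seed = 0;
    std::vector<std::string> slots;
    std::vector<bool> occupied;

    // Try seeds 0..max_seed-1 until every key lands in its own slot.
    // Keys must be distinct. Returns false if no such seed is found.
    bool build(const std::vector<std::string>& keys,
               std::size_t slot_count, std::uint64_t max_seed) {
        if (slot_count == 0 || slot_count < keys.size()) return false;
        for (std::uint64_t s = 0; s < max_seed; ++s) {
            std::vector<std::string> trial(slot_count);
            std::vector<bool> used(slot_count, false);
            bool ok = true;
            for (const auto& k : keys) {
                std::size_t i = seeded_hash(k, s) % slot_count;
                if (used[i]) { ok = false; break; }  // collision: reject seed
                used[i] = true;
                trial[i] = k;
            }
            if (ok) {
                seed = s;
                slots = std::move(trial);
                occupied = std::move(used);
                return true;
            }
        }
        return false;
    }

    // One hash, one comparison: no collision handling is needed at lookup.
    bool contains(const std::string& key) const {
        if (slots.empty()) return false;
        std::size_t i = seeded_hash(key, seed) % slots.size();
        return occupied[i] && slots[i] == key;
    }
};
```

    The seed is chosen for one specific key set. A key added later may hash to a slot that is already taken, at which point the table has to be rebuilt with a new seed.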

    You also make claims about the use of STL's map that just aren't true. Not everything you mention uses std::map. std::map is not a hash-based map; it's a key-ordered map, typically implemented as a red/black tree.
    There's also a hash map in std::unordered_map (using buckets for collision resolution).
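
    The difference between the two containers can be seen with a short sketch. The helper names (keys_in_iteration_order, bucket_of) are hypothetical; the member functions used (find, bucket, bucket_count) are part of the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Collect the keys in whatever order the container iterates them.
// std::map yields sorted keys; std::unordered_map yields an unspecified order.
template <typename Map>
std::vector<std::string> keys_in_iteration_order(const Map& m) {
    std::vector<std::string> out;
    for (const auto& kv : m) out.push_back(kv.first);
    return out;
}

// Index of the bucket a key belongs to in an unordered_map.
// Keys that collide are stored together in the same bucket.
std::size_t bucket_of(const std::unordered_map<std::string, int>& m,
                      const std::string& key) {
    return m.bucket(key);
}
```

    Iterating a std::map holding "c", "a", "b" returns "a", "b", "c", because the tree keeps keys ordered by comparison. The unordered_map gives no ordering guarantee, but exposes its buckets through bucket() and bucket_count().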

  4. #4
    Join Date
    Jun 2005
    Posts
    3

    Re: The core of the core of the big data solutions -- Map

    Quote Originally Posted by razzle
    Perfect hashing comes at a cost and it will show up somewhere eventually. There is no free lunch.

    But by all means, if you have revolutionized hashing then I suggest you write a scientific article. If you can achieve that "the performance will increase hundredfold without any problems" I'm sure it won't be hard to get it published.
    My English is not good

    But with the help of Google translation

    I understand your meaning

    and i can't express the algorithm principle with the English

    I can only say that

    you can test my algorithm

    You will find

    my algorithm performance and memory is very good

  5. #5
    Join Date
    Jun 2005
    Posts
    3

    Re: The core of the core of the big data solutions -- Map

    My English is not good

    But with the help of Google translation

    I understand your meaning

    and i can't express the algorithm principle with the English

    I can only say that

    you can test my algorithm

    You will find

    my algorithm performance and memory is very good

  6. #6
    Join Date
    Jul 2013
    Posts
    541

    Re: The core of the core of the big data solutions -- Map

    Quote Originally Posted by pww71
    My English is not good

    But with the help of Google translation

    I understand your meaning

    and i can't express the algorithm principle with the English
    Too bad because this is an English language forum.

    And generally, if you don't know (written) English it's very hard to communicate with an international computer science audience.

    Anyway good luck and if you have some revolutionary findings I'm sure you'll find a way to get the good news out
