Numbering entries in a data file

**aonghas** · August 11th, 2010, 07:26 AM

Hi there,

I'm working on my contact list program and I've got to the point where I can add contacts with name(s), email and phone to a data file (in that order). For example ./addcontact John Baker john@domain.com 0211023090 will add that to a line in my contactlist.dat file with spaces in between the arguments. However I am required to have these contacts numbered. So if I was to add multiple contacts they would be listed like so:

1 John Baker john@domain.com 0211023090
2 Ben Parker ben@domain.com 29347728
3 Mary Beans mary@domain.com 2342342

and so on...

My question is how do I write my program to set the contact number before writing to the file? Later on when I write a program to remove a contact, say contact 2 the data file will have:

1 John Baker john@domain.com 0211023090

3 Mary Beans mary@domain.com 2342342

so when I add a new contact, I want the program to assign the next contact with number 2. Is there a way that I can open and read the contact list line by line searching for a break in the numbering to get a number, otherwise assign the next number after the last contact?

Thanks heaps,
Aonghas

**ahmd** · August 11th, 2010, 02:14 PM

Well, you have a typical fragmentation issue that most database engines (and a file system) has to deal with. My question to you, are you sure that you want to implement all this by yourself? You can probably benefit from using a built in mechanism in any of the database engines available, say, through MFC, in the form of DAO and ODBC database classes.

If you still want to do it yourself, one approach would be to parse your file into a List type array in memory, then organize that array according to your specifications and write it back into your text file. Or using that same memory array locate the smallest ID that is missing. All that may require some coding on your part.

**aonghas** · August 12th, 2010, 12:48 AM

Thanks for your reply. Okay turns out I don't actually have to have them ordered in the data file by number. I just need to write some code that looks at the numbered list through the whole data file and return the lowest free number. So for the above example, I could have:

1 John Baker john@domain.com 0211023090
3 Mary Beans mary@domain.com 2342342

and then when I add another contact, it would be assigned number 2 but could just be added to the bottom:

1 John Baker john@domain.com 0211023090
3 Mary Beans mary@domain.com 2342342
2 Ben Parker ben@domain.com 29347728

Does this make sense? I can generate numbers okay, but I don't know how to look for the lowest free number by searching through the list of numbers already in the data file?

Thanks heaps
Aonghas

**GNiewerth** · August 12th, 2010, 02:21 AM

1) Does it make sense to store all your data in memory and occasionally save the whole data? You could perform all changes in RAM and don´t have to bother with fragmentation or holes.
2) Is it necessary to have consecutive numbers or are unique numbers sufficient? It´s easy to determine a new contact´s ID when holes are allowed.
3) Are you bound to a text file? SQLite is a small footprint dbms that compiles into your program, so you don´t need an external DB server.

**aonghas** · August 12th, 2010, 04:11 AM

Hi there GNiewerth

Thanks for your reply. In reply to your questions:

1) I'm not sure what the best/most efficient method would be. I've heard of something possibly like reading the whole .dat file into RAM and then performing all the actions and then writing them back to the file, but I don't know how to go about this. Is this where vectors comes into play?

2) I think the project asks us to fill in the missing numbers after they are removed from the database. So if there are 5 contacts and contacts 2 and 3 are removed, the file would be left with 1,4,5. The next new contact that's added would be 2, then 3, then 6 etc.

3) I have to store the data in a .dat file, though order is not important in the datafile. However when the user enters a command ./contlist (which I haven't written yet), all the contacts in the datafile should be displayed in alphabetical order at the command prompt.

Thanks heaps,
Aonghas

**GNiewerth** · August 12th, 2010, 06:39 AM

Ok, I see. In that case std::map is the container of your choice, because it makes the search for the next free ID much simpler and efficient than std::vector. Please look at this example:

Code:

#include <map>
#include <string>

// the basic contact data structure
struct Contact
{
   std::string Name;
   std::string EMail;
   std::string Phone;
   unsigned int ID;
};

// a container that stores ID/contact pairs
typedef std::map<unsiged int, Contact> ContactMap_t;

unsigned int nextID( const ContactMap_t& Contacts )
{
   if( Contacts.empty() )
   {
      // if there are no contacts the next unused ID is 1
      return 1;
   }
   // optimized for a contact map without holes:
   // if the last entry&#180;s ID is equals to the map&#180;s size
   // then there are no holes and the first unsed ID is
   // last contact&#180;s ID +1
   if( Contacts.size() == Contacts.rbegin()->first )
   {
      return Contacts.rbegin()->first +1;
   }
   // the map has holes.... let&#180;s find it
   // get the first contact from all contacts
   ContactMap_t::const_iterator pos = Contacts.begin();

   // for the first contact we must check if its ID is greater than one
   if( pos->first > 1 )
   {
      // contact ID 1 does not exist, so 1 is the next unused ID
      return 1;
   }
   // iterate over all remaining contacts and
   while( pos != Contacts.end() )
   {
      // check if the current contact&#180;s successive ID is used
      if( Contacts.find( pos->first +1 ) == Contacts.end() )
      {
         // ID isn&#180;t used, return it
         return pos->first +1;
      }
      // advance to next contact
      ++pos;
   }
   // should never get here
   assert( false );
}

This solution makes use of the fact that std::map organizes its entries by comparing the keys and ordering them by std::less, i.e. small keys (=ID) come first. All the function does is to check for each entry if there´s another entry with the next consecutive ID. If there is no such entry we have either reached the end of the map or we found a hole. No matter why, the next unused ID is always the current ID +1.

Edit (1): code may not compile, it´s just to show the idea
Edit (2): ooops, didn´t pay attention to #3
Edit (3): Luckily you have two distinct programs, so one can use std::map<unsigned Contact> while the other can use std::set<Contact> (in conjunction with a comparison object) or std::vector<Contact> (in conjunction with std::sort)

**aonghas** · August 14th, 2010, 04:04 AM

Thanks heaps for your reply. Your method seems like a good idea. I'm just a bit confused about the contact structure. So there are multiple contacts with other information stored in a structure array? What does Contact.end() mean?

I've got a text file with examples like that above eg:

1 John Baker john@domain.com 0211023090
3 Mary Beans mary@domain.com 2342342
2 Ben Parker ben@domain.com 29347728

and I have a class called Contact:

class Contact {
private:
int contactNum;
string name;
string email;
string phone;

public:
unsigned int getContactNum();
void setName(string name);
string getName();
void setEmail(string email);
string getEmail();
void setPhone(string phone);
string getPhone();
};

I was wondering can I make a vector of Contact objects and read in from the data file line by line and store the different information into each contact object, each stored in a different vector? That way I can access any contact's information through the vector. I don't know how to set this up though.. I already have a function that can determine what type of information it is (eg name, email, phone, number)

Thanks
Aonghas

**GNiewerth** · August 16th, 2010, 02:50 AM

std::map is a container that stores key/value pairs. In this case the key type is an unsigned int (the contact ID) and class Contact (the contact data itself). Basically it looks like a table (internally it´s usually a binary tree) :

key | value
----+-------------------------------------------
001 | 1 John Baker john@domain.com 0211023090
002 | 2 Ben Parker ben@domain.com 29347728
003 | 3 Mary Beans mary@domain.com 2342342
005 | 5 John Doe doe@domain.com 1234567

std::map is an ordered container, by default it orders its elements by comparing the the keys using std::less (KeyA < KeyB). If you have two consecutive entries with Key1 and Key2 respectively you can be sure that Key1 is smaller than Key2, in this case you can be sure that contact´s #1 ID is smaller than contacts´s #2 ID. This helps tremendously when finding the first unused ID when adding a new contact.

Of course you can use std::vector, but finding the first unused ID is more difficult because its elements aren´t ordered. You can sort the vector each time you insert/remove an element, or you can determine the correct insertion position for a new entry, but std::map offers all that for without writing a single line of extra code, so I think std::map fits better than std::vector.

If you have questions about std::map´s members you can look them up at C++ reference .

btw:
The performance of my code above can be greatly improved by replacing the find call with an alternative. Can you spot which one?

Thread: Numbering entries in a data file

Thread Tools

Display

Numbering entries in a data file

Re: Numbering entries in a data file

Re: Numbering entries in a data file

Re: Numbering entries in a data file

Re: Numbering entries in a data file

Re: Numbering entries in a data file

Re: Numbering entries in a data file

Re: Numbering entries in a data file

Posting Permissions