Saturday, 11 July 2015

What's the best data structure when much of the returned data is repeated?

I am designing the data structure for a search site. I created an SQL table that looks like the example below. When a user enters a key from the table (column A), I want to return the corresponding column B record. My approach was to index the records of column A and store the records of column B using Lucene, but I realized this is not ideal because many words in column B are duplicated (for example, "take" appears twice in column B and is also used as a key in column A). In this case, what is a memory-efficient data architecture?

e.g.

Column A    Column B
make        take, sell, look, love, time
take        care, people, search, love
sell        cook, see, take, time
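One way to picture the duplication problem is dictionary encoding (string interning): each distinct word is stored once and every row refers to it by a small integer ID. The sketch below is only an illustration of that idea using the example rows above. It is not Lucene's API, and the class `InternedIndex` and its method names are hypothetical.

```python
from array import array


class InternedIndex:
    """Hypothetical sketch: store each distinct word once, rows hold integer IDs."""

    def __init__(self):
        self._ids = {}      # word -> integer ID
        self._words = []    # integer ID -> word
        self._related = {}  # key ID -> compact array of column B word IDs

    def _intern(self, word):
        # Return the existing ID for a word, or assign the next free one.
        word_id = self._ids.get(word)
        if word_id is None:
            word_id = len(self._words)
            self._ids[word] = word_id
            self._words.append(word)
        return word_id

    def add(self, key, values):
        # Column A key and column B words share the same vocabulary.
        key_id = self._intern(key)
        self._related[key_id] = array("I", (self._intern(v) for v in values))

    def lookup(self, key):
        # Unknown keys, and words that only appear in column B, return [].
        key_id = self._ids.get(key)
        if key_id is None or key_id not in self._related:
            return []
        return [self._words[i] for i in self._related[key_id]]

    def vocabulary_size(self):
        return len(self._words)


index = InternedIndex()
index.add("make", ["take", "sell", "look", "love", "time"])
index.add("take", ["care", "people", "search", "love"])
index.add("sell", ["cook", "see", "take", "time"])

print(index.lookup("take"))
print(index.vocabulary_size())
```

With these three rows the table holds 16 word occurrences but only 11 distinct words, so each repeated word costs one unsigned integer per occurrence instead of another copy of the string. Whether this saves memory in practice depends on vocabulary size and word length, so it would need measuring on the real data.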
