
Best way to handle duplicate entries.

VIP
Moderator
Prelude 17 years, 7 months ago at Nov 1 12:30 -
Tom, now that you've given us the ability to edit entries in games, music, and soon in books as well, how do you want us to handle duplicates?

With Edit Item, I can finally clean up titles, apply proper platform, and proper release date.

But what about duplicate entries?

In some cases, items that appear to be duplicates may not be, as they are re-released items, such as the game Red Alert. So I added the word (re-issue) to the title so people can know if it's the original game from 1996 or the recent re-issue (which may or may not have extras added in).

But in other cases, such as when you search for C&C Generals Zero Hour:
www.listal.com/search/games/1/?query=command+zero

You get the real release (linked to amazon.com) as well as a 100% identical amazon.co.uk release, and a 3rd one that seems to have been added manually. I added (duplicate) to the title of the 2nd Zero Hour from amazon.co.uk, because a few people have that, and I was thinking of removing the entire title of the 3rd one and calling it (duplicate item), so no one can search for it in future. Nobody has that item in their list, so there's no loss if I do that to dead duplicate items, correct?

But I'm wondering if there's a way to merge the two Zero Hour expansion packs into one? Or if we collected a list of what we thought were duplicates, could you fix it up yourself? What do you suggest? And are you ok if I wipe out the name of duplicate entries with 0 people attached to them, and simply call the item (duplicate item)? Would it mess up the database by doing so?
Moderator
Admin
Tom 17 years, 7 months ago at Nov 2 0:44 -
I'm intending to write a script which will merge titles with the exact same name (and platform) (or music title and artist) into one, so the best thing would be to change the titles to match. In fact there may be no need to add a manual merge function as the titles could be changed and when the script ran they would be merged.

The only question is whether people should still be able to select the UK edition to appear in their lists rather than the US one. I don't think people are so bothered about this with games, but with books the same book may be released many times with different artwork, so this may bother some people.
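As a rough illustration of the merge-by-matching-title idea above, here is a minimal Python sketch. It is not the actual Listal script: the field names (`id`, `title`, `platform`) and the sample items are hypothetical, and the case and whitespace normalisation is an added assumption (the post only says "exact same name").

```python
from collections import defaultdict


def merge_key(item):
    # Assumption: ignore case and surrounding whitespace when comparing.
    # Drop the .strip().lower() calls for a strictly exact match.
    return (item["title"].strip().lower(), item["platform"].strip().lower())


def find_merge_groups(items):
    """Group item ids by (title, platform) and keep only groups with duplicates."""
    groups = defaultdict(list)
    for item in items:
        groups[merge_key(item)].append(item["id"])
    # Only keys shared by more than one entry are merge candidates.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical sample data for illustration only.
items = [
    {"id": 1, "title": "Command & Conquer Generals: Zero Hour", "platform": "PC"},
    {"id": 2, "title": "command & conquer generals: zero hour ", "platform": "pc"},
    {"id": 3, "title": "Red Alert (re-issue)", "platform": "PC"},
]

merge_groups = find_merge_groups(items)
```

With this approach, editing two titles so they match is enough to make them merge candidates on the next run, which is why a separate manual merge function may not be needed.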
Phil 17 years, 7 months ago at Nov 2 1:12 -
I'm a bit dubious about an actual merge since titles have different UPCs, release dates and covers. For example: the US and English versions of Is This It (www.listal.com/music/is-this-it-the-strokes and www.listal.com/music/is-this-it-the-strokes-83873); certain titles shared between Album and Single (such as Parklife by Blur, if I hadn't yesterday changed the singles to have the [CD 1], [CD 2] suffix); two completely different releases like the original and live versions of The Man Don't Give a F**k (Super Furry Animals) from almost a decade apart; and my UK and Australian versions of M.O.R. (Blur again). I've got plenty more examples where they came from.

They need linking to each other in some way, but the original versions really, really, really need to still be available otherwise I'll actually be losing loads of data and end up having to go right through my collection to recatalogue somewhere else again. It's a real showstopper for actual physical collections.
Moderator
Admin
Tom 17 years, 7 months ago at Nov 2 1:59 -
Ok, I will leave the separate editions but just share the ratings and reviews between the two.

The singles having the same name as the albums could be a major problem if there is no way of telling them apart (Amazon may supply this data).
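One way to picture separate editions sharing ratings is to store ratings against a common "work" rather than against each edition. The Python sketch below is only an illustration under that assumption: the `Edition` and `Catalogue` classes, the `work_id` link, and the ids and scores are all hypothetical, not Listal's schema.

```python
from dataclasses import dataclass, field


@dataclass
class Edition:
    edition_id: int
    title: str
    region: str
    work_id: int  # hypothetical link shared by every edition of the same work


@dataclass
class Catalogue:
    editions: dict = field(default_factory=dict)  # edition_id -> Edition
    ratings: dict = field(default_factory=dict)   # work_id -> list of scores

    def add_edition(self, edition):
        self.editions[edition.edition_id] = edition

    def rate(self, edition_id, score):
        # The rating is stored on the work, not the individual edition.
        work_id = self.editions[edition_id].work_id
        self.ratings.setdefault(work_id, []).append(score)

    def average_rating(self, edition_id):
        scores = self.ratings.get(self.editions[edition_id].work_id, [])
        return sum(scores) / len(scores) if scores else None


catalogue = Catalogue()
catalogue.add_edition(Edition(1, "Is This It", "US", work_id=10))
catalogue.add_edition(Edition(2, "Is This It", "UK", work_id=10))
catalogue.rate(1, 8)  # rated via the US edition
catalogue.rate(2, 6)  # rated via the UK edition
```

Both editions stay in the catalogue with their own ids, so nobody's list loses data, while `average_rating` returns the same pooled value for either one.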
VIP
Moderator
Prelude 17 years, 7 months ago at Nov 2 11:50 -
Ok, fair enough. But what about games with 0 played and 0 wants that are manual-entry duplicates? Are you ok if we rename the title to DUPLICATE just so it gets buried in the database? And if you want to clean up the database later, you can simply delete all entries called DUPLICATE?

Also, is there any way to make it more obvious that one title is the US edition and the other is the UK edition?
VIP
Moderator
Prelude 17 years, 7 months ago at Nov 3 15:31 -
www.listal.com/search/games/1/?query=half%20life

Ok, Tom, and everyone who's into editing entries to clean them up, check this out. I cleaned up all entries for the Half-Life series.

1. I removed all duplicates and/or manual entries that had 0 watched and 0 wants by naming them (duplicate), even removing the 'Half Life' title in them. (They can't be seen anymore, except by searching for the word "(duplicate)".)

2. For Half Life titles that are duplicates or triplicates but that have one or two people listing the item, I renamed the title to add the word (duplicate) and I also changed Platform to Duplicate. This way, it stays in their list, but hopefully will give them a clear sign they should be listing/rating the proper item, not the duplicate.

3. In terms of entries from amazon.co.uk vs amazon.com, some of the items, especially older entries, have more people listing the UK version than the US version. And since sometimes these entries are different (especially with music albums), it's best to keep both of them (unless Tom comes up with a way to merge them). But I decided to do two things to make it more obvious which is which. First, I added (UK) to the version sold on amazon.co.uk. And I also tried to find the UK version of the game's box (US versions are rated E, T and M, while UK boxes are typically 12+, 16+ etc.)

Hopefully people will notice this and switch to the proper version they have. I'm a bit torn about adding (UK) to all amazon.co.uk entries, because of clutter, but I wanted to make it easy for people doing searches to know there aren't tons of duplicates, but rather US and UK versions. For music albums, if some are Japanese imports, or other country versions of the same album, maybe add the two-letter country identifier to all titles as well.


The result of this (I hope) is a much cleaner look when doing searches, less confusion about duplicates, and rare items buried at the end of search results being easier to find.

Any thoughts/comments? Also, any ideas what to do with dupes that people have listed? Is it possible to force moves to the proper items and then wipe out the duplicate entries?
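For anyone wanting to apply rules 1 and 2 above consistently, they could be written down roughly as the Python sketch below. It only mirrors the manual renaming convention described in this post; the entry fields (`title`, `platform`, `owned`, `wanted`) are hypothetical names, not real Listal data.

```python
def mark_duplicate(entry):
    """Return a relabelled copy of a duplicate entry, following rules 1 and 2."""
    listed = entry["owned"] + entry["wanted"]
    updated = dict(entry)  # leave the original entry untouched
    if listed == 0:
        # Rule 1: nobody lists it, so bury it by dropping the real title.
        updated["title"] = "(duplicate)"
    else:
        # Rule 2: keep it recognisable for the people who list it, but flag it.
        updated["title"] = f'{entry["title"]} (duplicate)'
        updated["platform"] = "Duplicate"
    return updated


# Hypothetical sample entries for illustration only.
unlisted = {"title": "Half-Life", "platform": "PC", "owned": 0, "wanted": 0}
listed = {"title": "Half-Life", "platform": "PC", "owned": 2, "wanted": 0}

buried = mark_duplicate(unlisted)
flagged = mark_duplicate(listed)
```

Here `buried` ends up titled just "(duplicate)", while `flagged` keeps its name with "(duplicate)" appended and its platform set to "Duplicate".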
Deleted user
Deleted 17 years, 7 months ago at Nov 3 17:44 -
Some databases also use the original cover for a book, or the first one they have, for older books (i.e., The Argonautica). Just a small thought for later in the clean up.

I'll look for potential Duplicates.
Phil 17 years, 7 months ago at Nov 4 2:05 -
Personally, I'd actually quite like the opportunity to add merges manually that people - ideally the top points owners - then can accept or reject. So long as the likes of Prelude, tartan_skirt and so on have a decent set of guidelines, I'd be quite happy with others voting on suggestions. I quite accept that there are merges that do need to happen, and that links need to exist for reviews/ratings purposes as well as to ensure that people that own a US version of an album don't get recommended the UK version (for example).

RE: point 3, I'd personally be dead against having "UK" in the title of any of my products - it's not part of the name at all damnit - but that's really what the region works well for... if we had the ability to set DVD region and whether a game is PAL or NTSC (and even Music release region!) then that should cover most of the issues along with covers. If people then list the wrong one, it's really their own problem, particularly if Tom can solve the linking the different versions problem, since it doesn't affect the reviews or recommendations that are shown.

Basically, I do see the need for merging and agree wholeheartedly with the principle, but just don't feel that it's a problem that can be solved automatically IMHO - it's a problem that's way beyond human solutions.
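To make the manual merge suggestion with accept/reject voting more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `MergeProposal` class, the item ids, and the threshold of two matching votes are hypothetical and not an existing Listal feature.

```python
from dataclasses import dataclass, field


@dataclass
class MergeProposal:
    source_id: int  # the entry suggested as a duplicate
    target_id: int  # the entry it would be merged into
    votes: dict = field(default_factory=dict)  # reviewer name -> True/False

    def vote(self, reviewer, accept):
        # A reviewer voting again simply replaces their earlier vote.
        self.votes[reviewer] = accept

    def status(self, needed=2):
        # Assumed rule: 'needed' votes either way settle the proposal,
        # with rejections checked first so a disputed merge is not applied.
        accepts = sum(1 for v in self.votes.values() if v)
        rejects = len(self.votes) - accepts
        if rejects >= needed:
            return "rejected"
        if accepts >= needed:
            return "accepted"
        return "pending"


proposal = MergeProposal(source_id=101, target_id=100)
proposal.vote("Prelude", True)
status_after_one = proposal.status()
proposal.vote("tartan_skirt", True)
status_after_two = proposal.status()
```

Keeping the decision with human reviewers, rather than merging automatically, is the point of the suggestion: the script only records and tallies votes.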
VIP
Moderator
Prelude 17 years, 7 months ago at Nov 5 2:51 -
Yeah, adding (UK) to the games is more of just trying to get this US vs UK editions issue noticed, but I think 2 duplicates is bearable rather than 10 or more. So I'll remove the UK from those Half-Life titles. It's the one thing I wasn't too sure about myself, so glad to hear your thoughts on it, Phil.