C: Removing the Duplicates

1: What are duplicate search results?

If you search several resources and databases for the same research question, you will almost certainly retrieve duplicate search results.

If the bibliographic information of two references, papers, or search results is the same, we consider them duplicates. If you are using a bibliographic management program such as EndNote, you may see duplicate references when sorting the records by title.

*Bibliographic information varies depending on the type of reference. For a journal paper, the bibliographic information is: Authors’ Names, Title of the Paper, Name of the Journal, Volume and Issue of the Journal, and the Pages on which the paper has been published. DOI, PMID, Abstract, etc., might be counted as bibliographic information as well.
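To make the definition above concrete, here is a minimal sketch in Python of how two records could be compared on their bibliographic information. The field names (`title`, `journal`, `volume`, `pages`), the helper names, and the two sample records are hypothetical and made up for illustration; they are not the export format of any particular database or program. The sketch assumes that differences in capitalisation, punctuation, and page-range style should not prevent two records from matching.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def duplicate_key(record):
    """Build a comparison key from title, journal, volume, and first page."""
    # Only the first page is used, because databases abbreviate page
    # ranges differently (e.g. "101-110" versus "101-10").
    first_page = record.get("pages", "").split("-")[0]
    return (
        normalize(record.get("title", "")),
        normalize(record.get("journal", "")),
        normalize(record.get("volume", "")),
        normalize(first_page),
    )


# Hypothetical records for the same paper, as two databases might export it.
record_a = {
    "title": "An Example Title: A Systematic Review.",
    "journal": "Example Journal",
    "volume": "12",
    "pages": "101-110",
}
record_b = {
    "title": "An example title: a systematic review",
    "journal": "Example journal",
    "volume": "12",
    "pages": "101-10",
}

is_duplicate = duplicate_key(record_a) == duplicate_key(record_b)
print(is_duplicate)
```

Author names are left out of the key on purpose in this sketch, because databases often format them differently; a stricter definition could add them, or a DOI or PMID where available.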

2: Why are there duplicates among the search results?

As the volume of academic literature grows, it is impossible to keep up with every journal. That’s why bibliographic databases such as MEDLINE, EMBASE, CINAHL, and PsycINFO collect the bibliographic information of papers published in each journal to let you search them all in one place within a single database – so-called ‘indexing’.

To make their papers more visible, journals aim to be indexed in as many important databases as they can. This means that if you search several medical databases for the same research question, you will find the same paper several times, because its journal is indexed in more than one database.

3: How to find and remove the duplicates?

There are several ways to remove the duplicates.

3.1: Removing duplicates in databases while searching

If you consider MEDLINE the core database for your search, you can exclude MEDLINE records while searching CINAHL or PubMed. You could also exclude the records of MEDLINE journals from your EMBASE results, or limit your search to EMBASE-only records when searching EMBASE.

These methods look appealing; however, they all share a major problem: how do we know that they are actually removing the duplicates, and how well they are working? In practice, we cannot be sure.

3.2: Removing the duplicates using the duplicate detection feature in reference management programs

If you are using EndNote or any other reference management program, you can define what counts as a duplicate. The program then detects and displays the duplicates so that you can review them and decide which records to delete.
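The idea of a user-defined duplicate can be sketched in a few lines of Python. This is only an illustration of the principle, not how EndNote or any other program implements it. The function name `find_duplicates`, the field names, and the sample records are all hypothetical. The `fields` argument plays the role of the duplicate definition: two records are grouped together only if they agree on every listed field.

```python
from collections import defaultdict


def find_duplicates(records, fields=("title", "year")):
    """Group records that agree on all the chosen fields."""
    groups = defaultdict(list)
    for record in records:
        # Compare field values case-insensitively, ignoring outer spaces.
        key = tuple(str(record.get(f, "")).strip().lower() for f in fields)
        groups[key].append(record)
    # Only groups with more than one record contain duplicates.
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical search results exported from three databases.
records = [
    {"id": 1, "source": "MEDLINE", "title": "Example trial of drug X", "year": 2015},
    {"id": 2, "source": "EMBASE", "title": "Example Trial of Drug X", "year": 2015},
    {"id": 3, "source": "CINAHL", "title": "Another example study", "year": 2016},
]

# Display each group so that a person can decide which record to keep.
for group in find_duplicates(records):
    print([(r["id"], r["source"]) for r in group])
```

The sketch only reports the groups and deletes nothing, which mirrors the workflow described above: the program proposes the duplicates and you make the final decision. A looser definition (fewer fields) finds more candidate duplicates but risks grouping different papers together; a stricter one does the opposite.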

3.3: Manual Removal of the duplicates in reference management programs

If you sort your references by Title within your reference management program and check them one by one, you can detect and delete the duplicates manually. If you have a large number of search results, this method will be very time-consuming.
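The manual check works because sorting by title places duplicates next to each other. A minimal Python sketch of that step follows; the function name `adjacent_title_matches` and the sample records are hypothetical, and the sketch assumes that titles need to match exactly apart from capitalisation.

```python
def adjacent_title_matches(records):
    """Sort records by title and return the ids of neighbouring records
    whose titles are identical apart from capitalisation."""
    ordered = sorted(records, key=lambda r: r["title"].casefold())
    pairs = []
    # After sorting, identical titles sit side by side, so it is enough
    # to compare each record with the one directly after it.
    for previous, current in zip(ordered, ordered[1:]):
        if previous["title"].casefold() == current["title"].casefold():
            pairs.append((previous["id"], current["id"]))
    return pairs


# Hypothetical library of three references.
library = [
    {"id": 1, "title": "Zebra study"},
    {"id": 2, "title": "alpha study"},
    {"id": 3, "title": "Alpha Study"},
]

print(adjacent_title_matches(library))
```

An exact comparison like this misses titles that differ by a typo or a trailing full stop, which is one reason the human eye is still needed when checking a sorted list.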