Typology of Duplicate Records in Systematic Review Context

Duplicates in Systematic Review

Note: There is no advertisement or marketing component in this post.

Anyone who conducts systematic reviews knows that searching more than one database naturally produces duplicate records; however, these are not the only duplicates the review team has to deal with. Duplicates occur at several stages of the systematic review process, and dealing with them is often confusing and requires skill.

Intra-Database (Cross-Database) Duplicate Records

Since researchers and reviewers now search bibliographic databases to find literature relevant to their research, many publishers do their best to have their journals indexed in as many relevant databases as possible. Why? Because that is the best way to make their journals more visible.

When we search more than one database (the norm in systematic reviews) with the same or similar search strategies, the same journal papers appear among the search results of several databases. When we export the search results from all databases into a citation manager such as EndNote, Zotero, or Mendeley, we have the option to define, find, and remove these duplicates, a process known as de-duplication.

Some of these duplicate records cannot be recognised at the search, de-duplication, or even title-abstract screening stages of a systematic review because their details do not look the same at first sight. For example, non-English records and journal names are indexed differently in each database. Since more than one reviewer is involved in the post-search stages of the systematic review, no single reviewer sees all the records. It is easy to overlook some of these records; reviewers usually identify some duplicates during full-text screening, data extraction, or sometimes after the meta-analysis or peer-review stages.

Some databases let you exclude records that come from another database. For example, CINAHL allows you to exclude MEDLINE records. Other than that, we usually have to use manual, automated, or semi-automated methods to find and remove duplicate records.
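As a rough illustration of the automated route, here is a minimal Python sketch. It assumes each exported record is a dictionary with hypothetical `source`, `title`, `year`, and `doi` fields (real citation manager exports will differ), and it matches records on DOI when one is present, otherwise on a normalised title plus year. It will not catch every case, for example a record with a DOI in one database and without one in another.

```python
import re
import unicodedata


def normalise_title(title):
    """Lower-case, strip accents and punctuation so minor indexing differences vanish."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def record_key(record):
    """Prefer the DOI as the key; fall back to normalised title plus year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    return ("title", normalise_title(record.get("title", "")), record.get("year"))


def deduplicate(records):
    """Return (unique, duplicates), keeping the first record seen for each key."""
    seen = set()
    unique, duplicates = [], []
    for record in records:
        key = record_key(record)
        if key in seen:
            duplicates.append(record)
        else:
            seen.add(key)
            unique.append(record)
    return unique, duplicates


# Hypothetical records, invented for illustration only.
records = [
    {"source": "MEDLINE", "title": "Café-based care: a trial.", "year": 2020, "doi": ""},
    {"source": "Embase", "title": "Cafe based care - a trial", "year": 2020, "doi": ""},
    {"source": "CINAHL", "title": "Another study", "year": 2019, "doi": "10.1000/XYZ"},
    {"source": "Embase", "title": "Another study.", "year": 2019, "doi": "10.1000/xyz"},
]
unique, duplicates = deduplicate(records)
print(len(unique), "unique;", len(duplicates), "duplicates")
```

Keeping the removed records in a separate list, rather than deleting them, makes it easier to report the number of duplicates later and to check the matches by hand.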

Inter-Database Duplicate Records

Sometimes, a single database may hold the same record more than once. It could be a simple double-entry error, or a version-control error could cause it. Many publishers nowadays publish their papers as e-pub, early view, or online first to make them accessible to readers with little or no delay after the acceptance date. These papers usually have a unique DOI but not a set year of publication, volume, issue, or page numbers. In turn, some of the databases pick up such early in-press publications and index them to make them available to their users. When the full paper is later published in a paginated format with full bibliographic details such as year, volume, issue, and page numbers, the database may fail to update these details, or may add the fully published paper as a new record. As a result, the same paper title may appear twice, if not more, among the search results of the same database.
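One possible way to handle the e-pub versus paginated case is to group records by DOI and keep whichever copy has the most complete bibliographic details. The sketch below is only an illustration: the field names are hypothetical, and it assumes every record carries a DOI, which is not always true in practice.

```python
from collections import defaultdict

BIBLIOGRAPHIC_FIELDS = ("year", "volume", "issue", "pages")


def completeness(record):
    """Count how many bibliographic fields are filled in."""
    return sum(1 for field in BIBLIOGRAPHIC_FIELDS if record.get(field))


def keep_most_complete(records):
    """Group records by DOI and keep the most complete record in each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record["doi"].lower()].append(record)
    return [max(group, key=completeness) for group in groups.values()]


# Hypothetical records: the first two are the e-pub and the paginated version.
records = [
    {"doi": "10.1000/abc", "title": "A trial", "year": None, "volume": None, "issue": None, "pages": None},
    {"doi": "10.1000/ABC", "title": "A trial", "year": 2021, "volume": "12", "issue": "3", "pages": "45-52"},
    {"doi": "10.1000/def", "title": "Another trial", "year": 2020, "volume": "8", "issue": "1", "pages": "1-9"},
]
kept = keep_most_complete(records)
print([r["pages"] for r in kept])
```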

Intra-Search (Cross-Search) Duplicate Records

Systematic reviews are only as up to date as their search date. Most of the important systematic reviews are published within 12 months of the search date, and if there is a delay, the authors usually run an ‘update search’. Even after the publication of systematic reviews, there are always reviewers who try to update them.

There are three ways to update a search: Auto-alerts, running a full search, or date limitation.

  1. Saving the searches in the database’s user account and setting automatic periodical search alerts to receive the new results in your inbox;
  2. Running the update search from scratch and de-duplicating the new search results against the previous (old) search results;
  3. Running the update search using date limitation options in each database; such limitation could be to Date Published, Date Entered, Date Created, Publication Week, or Publication Year depending on how elaborately a database indexes these details.

Running an update search will also create Intra-Search Duplicate Records. For example, if you run a search in 2010 and then update it in 2015, then no matter how accurate your method of updating is, you will find records in the 2015 search that you have already seen in the 2010 search.

This may happen for several reasons, including but not limited to:

  1. Databases index at different speeds. Database A may index a record a few months or a year after Database B;
  2. The database updates the e-pub records and assigns new dates or year of publication;
  3. The database updates the records for any reason and adds a new date such as ‘date revised’ or ‘date entered’.
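The second update method (re-running the search and de-duplicating against the old results) can be sketched as follows. This is a minimal illustration with invented records and hypothetical field names; it matches on DOI where available and otherwise on a lower-cased title, and it deliberately ignores the dates because, as noted above, databases may change them.

```python
def record_key(record):
    """Use the DOI when present, otherwise a lower-cased, whitespace-collapsed title."""
    doi = (record.get("doi") or "").strip().lower()
    return doi or " ".join(record["title"].lower().split())


def split_update(old_results, update_results):
    """Separate the update search into already-seen and genuinely new records."""
    seen_keys = {record_key(r) for r in old_results}
    already_seen = [r for r in update_results if record_key(r) in seen_keys]
    new = [r for r in update_results if record_key(r) not in seen_keys]
    return already_seen, new


search_2010 = [
    {"title": "Early trial", "doi": "10.1000/a1", "date_entered": "2009-11-02"},
    {"title": "Pilot study", "doi": "", "date_entered": "2010-01-15"},
]
search_2015 = [
    # Same records as in 2010, but re-dated by the database in the meantime.
    {"title": "Early trial", "doi": "10.1000/A1", "date_entered": "2011-03-20"},
    {"title": "Pilot  Study", "doi": "", "date_entered": "2012-06-01"},
    {"title": "New trial", "doi": "10.1000/b2", "date_entered": "2014-09-09"},
]
already_seen, new = split_update(search_2010, search_2015)
print(len(already_seen), "already seen;", len(new), "new")
```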

Intra-Method (Cross-Method) Duplicate Records

Systematic searching of bibliographic databases is the main but not the only way to find relevant studies for systematic reviews. Other methods include contacting experts, checking the reference lists of included studies, tracking citations to the included studies, and so on.

Reviewers are usually confused about how to report the duplicates found through these methods in their PRISMA flow diagram, because the main de-duplication is reported immediately after the search stage, whereas reference and citation checking happens after the full-text screening stage. So, if there are duplicates between the records from the systematic search and the records from reference checking, it is unclear where to report them. In larger systematic reviews, there are almost always such duplicate records.

Inter-Study Duplicate Records

Once researchers secure funding for a research project, they try to create as much academic output as possible. It is common in medical sciences for researchers to publish the findings in several papers and present them at several conferences. Such dissemination creates conference abstracts that have been presented at different conferences but with the same or very similar title, abstract, and authorship.

While many reviewers consider such abstracts as duplicates or unimportant — they may be right — they are not considered duplicates in a systematic review; rather, they are different reports of the same study.

The best way to deal with them is to keep them under one study name and cite all of them — So-called Studification. For example, Jackson et al. 2021 [8–12]. This way of dealing with them has several benefits:

  1. The reader would know that although this is one study, this study has been reported in several papers;
  2. The systematic effort of identifying all the reports has been documented properly and shows how carefully the researchers have checked every single paper;
  3. If you delete them, the aware readers and users will be confused why you have not included this or that paper or conference abstract; by keeping them, you answer their question that those reports all belong to the same study;
  4. Although these duplicate records may not add anything new, when they do, they usually report important missing details or discrepancies. For example, they may report more participants than the original full paper and help you critically appraise the reason for missing those participants.
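To make the idea concrete, here is a small Python sketch of grouping reports under one study name and flagging discrepancies between them. The records, reference numbers, and participant counts are invented for illustration, and the `study` label is assumed to have been assigned by the reviewers by hand; the code only does the grouping and comparison.

```python
from collections import defaultdict

# Hypothetical reports; "study" is the label the review team assigned manually.
reports = [
    {"ref": 8, "study": "Jackson 2021", "type": "journal article", "participants": 120},
    {"ref": 9, "study": "Jackson 2021", "type": "conference abstract", "participants": 132},
    {"ref": 10, "study": "Jackson 2021", "type": "conference abstract", "participants": 120},
    {"ref": 11, "study": "Lee 2019", "type": "journal article", "participants": 80},
]


def studify(reports):
    """Group all reports under the study they belong to."""
    studies = defaultdict(list)
    for report in reports:
        studies[report["study"]].append(report)
    return dict(studies)


def discrepancies(study_reports, field):
    """Return the distinct values reported for a field when the reports disagree."""
    values = sorted({r[field] for r in study_reports})
    return values if len(values) > 1 else []


studies = studify(reports)
for name, study_reports in studies.items():
    refs = ", ".join(str(r["ref"]) for r in study_reports)
    print(f"{name} [{refs}]", discrepancies(study_reports, "participants"))
```

In this made-up example, the three Jackson reports stay together under one study, and the differing participant counts are surfaced for critical appraisal rather than lost through deletion.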

One of the more recent categories within Inter-Study Duplicate Records is Intra-Version (Cross-Version) Duplicate Records. With recent movements towards open science, more researchers release their manuscripts earlier or in formats other than journal publication, either through pre-print servers such as arXiv, medRxiv, bioRxiv, or others, or through institutional repositories. These are duplicates of the published version of the paper; however, like the above-mentioned conference abstracts, they may help detect discrepancies and important details.

Inter-Dataset Duplicate Data

This is one of the trickiest duplicates to deal with. When a dataset is available, researchers tend to play with it and publish as many papers as possible. It is possible to identify salami publications of the same research; however, there is always a chance that separate reports of the same study use the same or similar data and cause Inter-Dataset Duplicate Data.

Inter-Dataset Duplicate Data fall into several categories depending on how the data are released: time-dependent, salami, data volume dependent, imprisoned data, and open data.

  1. Time-Dependent Release: Researchers report only part of the results such as primary results in one paper, the final results in another paper, and follow-up results in a third paper;
  2. Salami Release: To increase the number of their publications, they report only one part of the findings per paper;
  3. Data Volume Dependent Release: Since there are a lot of data generated from the research, the researchers have no choice but to report it in several papers because the journals have a limitation of paper length;
  4. Imprisoned Data Release: Since the researchers have access to the private dataset, they publish several papers from those data even a decade after the end of their research. Such publications appear in journals as post hoc analysis or secondary analysis papers;
  5. Open Data Release: The research dataset is open online for the public, and any researcher can access and generate publications out of these data.

Reviewers need to assess the reports and choose the highest-quality and most comprehensive report of the dataset.

Intra-Report (Cross-Report) Duplicate Data

Those who are able to run multiple research studies alongside each other, mainly pharmaceutical companies, also tend to publish the findings from those studies together. They usually publish multiple papers, but each paper reports more than one study. Separating these data in an understandable and analysable way is not always easy, and it is also difficult to identify the unique data per study from these papers. There are almost always overlapping or duplicate data.

Conclusion

Contrary to the simplistic view that finding and removing duplicates is an easy, single step of systematic reviewing, it takes skill to prevent, identify, and remove duplicate and redundant reports and data. Duplication can be detected at any stage of the systematic review.

Structure of Search Strategies for Systematic Reviews: Line by Line versus Block by Block versus Single-Line

Search Strategy for Systematic Review

Over the past 18 years, I have had this discussion several times with information specialists, researchers, students, and colleagues. Some seem to be defending a single approach at the beginning of the conversation, but as we go forward, they change their minds.

Unsurprisingly, all three approaches of structuring the search strategies are useful at the right time and place.

Since there is no best practice guide, many get confused about what structure to choose for each purpose. Before I start, and to avoid wasting your time: this post is not about structuring the search based on PICO families (PECO, PIPOH, PICOS, etc.).

Line by Line Search Strategy

In this structure, borrowed from computer programming, we put a single term on each line, although there is no guide on how much information you can put on each line. Since each line is concise (usually one term or one set of terms followed by field tags), the whole strategy becomes long, and you have to scroll from top to bottom to see all the lines. In some cases, there are 300–400 lines! At the same time, since each line carries only a little information, it is easy to follow the lines and line numbers and to work out the structure from the Boolean operators AND, OR, and NOT. Line-by-line searching makes it very easy to find errors.

The main trouble is that to run this search you have to copy a line, paste it into the search box, and press Enter/Search once per line, which can mean 300–400 times. If this repetitive task does not turn you into a mad hatter, it will certainly waste your time. On the other hand, such strategies are easy to check and peer-review, and they show the contribution of each term/line to the search, which is beneficial when you develop or validate a search filter. I use this structure for training purposes, as students understand it well and find errors on the spot. See an example from Ovid Embase below:

1 exp Black Person/

2 Black*.ti,ab.

3 African*.ti,ab.

4 Multiethnic*.ti,ab.

5 "Women of Colo?r".ti,ab.

6 or/1-5

7 Maternal Health Service/

8 exp Pregnancy/

9 Prematurity/

10 Pregnancy Outcome/

11 Perinatal Care/

12 Pregnancy Complication/

13 Pregnant Woman/

14 Pregnan*.ti,ab.

15 Birth*.ti,ab.

16 Post?natal.ti,ab.

17 Gestation*.ti,ab.

18 Peri?natal.ti,ab.

19 Post?partum.ti,ab.

20 Matern*.ti,ab.

21 Ante?natal.ti,ab.

22 Pre?natal.ti,ab.

23 or/7-22

24 Health Disparity/

25 Health Care Disparity/

26 Racism/

27 Prejudice/

28 Social Discrimination/

29 Health Equity/

30 Social Justice/

31 Racis*.ti,ab.

32 "Racial Prejudice".ti,ab.

33 Discrimination.ti,ab.

34 Disparit*.ti,ab.

35 Inequit*.ti,ab.

36 Inequalit*.ti,ab.

37 Equalit*.ti,ab.

38 Equity.ti,ab.

39 Bias*.ti,ab.

40 or/24-39

41 exp United States/

42 (United States).ti,ab,ad,gc,go,in.

43 USA.ti,ab,ad,gc,go,in.

44 "U.S.".ti,ab,ad,gc,go,in.

45 "U.S.A.".ti,ab,ad,gc,go,in.

46 or/41-45

47 6 and 23 and 40 and 46
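Because every line has its own number, a line-by-line strategy can also be checked mechanically before it is run. The following Python sketch is only an illustration of that idea, not a feature of any database: it assumes each line starts with its number followed by a space, and it only understands the Ovid-style shorthand `or/1-5` and `and/1-5`, not combinations such as `6 and 23`.

```python
import re


def check_strategy(lines):
    """List numbering slips and or/and ranges that point at missing or later lines."""
    problems = []
    for position, line in enumerate(lines, start=1):
        number_text, _, query = line.partition(" ")
        if number_text != str(position):
            problems.append(f"line {position}: numbered {number_text!r}")
        # Shorthand such as or/1-5 must only refer to earlier lines.
        for first, last in re.findall(r"\b(?:or|and)/(\d+)-(\d+)", query):
            if not 1 <= int(first) <= int(last) < position:
                problems.append(f"line {position}: bad range {first}-{last}")
    return problems


# A short, deliberately faulty strategy invented for illustration.
strategy = [
    "1 exp Pregnancy/",
    "2 Pregnan*.ti,ab.",
    "2 Birth*.ti,ab.",  # numbering slip: should be 3
    "4 or/1-3",
    "5 or/1-9",  # range runs past the current line
]
for problem in check_strategy(strategy):
    print(problem)
```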

Block by Block Search Strategy

In this type of structuring, you have one line per search concept. If you break your question into PICOS (or any other framework) elements, you will have 5 lines: P (population/problem), I (intervention), C (comparator or control), O (outcome, rarely included in the search), and S (study design). The last line will combine all lines with AND to give you the final results.

In the following example, you only have to copy and paste 5 times, and you can see the search structure, which reflects the research question. It can be read easily, and you won’t need lots of scrolling unless you really have to run pages of terms. This structure is understandable to those who are not searching experts, and it helps you test combinations of 2, 3, 4, or 5 blocks and mix and match the blocks to see what number of results is more reasonable. It must be noted that in complex searches it may not be possible to break the search down into clean single-concept blocks, and some blocks may have sub-blocks and sub-sub-blocks.

1 exp Black Person/ or (Black* or African* or Multiethnic* or "Women of Colo?r").ti,ab.

2 Maternal Health Service/ or exp Pregnancy/ or Prematurity/ or Pregnancy Outcome/ or Perinatal Care/ or Pregnancy Complication/ or Pregnant Woman/ or (Pregnan* or Birth* or Post?natal or Gestation* or Peri?natal or Post?partum or Matern* or Ante?natal or Pre?natal).ti,ab.

3 Health Disparity/ or Health Care Disparity/ or Racism/ or Prejudice/ or Social Discrimination/ or Health Equity/ or Social Justice/ or (Racis* or "Racial Prejudice" or Discrimination or Disparit* or Inequit* or Inequalit* or Equalit* or Equity or Bias*).ti,ab.

4 exp United States/ or (United States or USA or "U.S." or "U.S.A.").ti,ab,ad,gc,go,in.

5 1 and 2 and 3 and 4

Single Line Search Strategy

It is possible to combine all the concepts and terms in a single line. This way, you need to copy, paste, and press Enter only once rather than several or hundreds of times; this is why all the searches that I ran when I worked as a clinical librarian in the emergency department were in the single-line structure. It may take the system some time to digest your search query, but a good system will eventually give you the results while you drink your coffee/tea/water. In some cases, the system may give you error(s), which will not be easy to find in a large search. Since you do not have many lines, the search does not take much space and does not require scrolling down. If you are busy or lazy, this may work just fine.

(exp Black Person/ or (Black* or African* or Multiethnic* or "Women of Colo?r").ti,ab.) and (Maternal Health Service/ or exp Pregnancy/ or Prematurity/ or Pregnancy Outcome/ or Perinatal Care/ or Pregnancy Complication/ or Pregnant Woman/ or (Pregnan* or Birth* or Post?natal or Gestation* or Peri?natal or Post?partum or Matern* or Ante?natal or Pre?natal).ti,ab.) and (Health Disparity/ or Health Care Disparity/ or Racism/ or Prejudice/ or Social Discrimination/ or Health Equity/ or Social Justice/ or (Racis* or "Racial Prejudice" or Discrimination or Disparit* or Inequit* or Inequalit* or Equalit* or Equity or Bias*).ti,ab.) and (exp United States/ or (United States or USA or "U.S." or "U.S.A.").ti,ab,ad,gc,go,in.)

Since it may seem a dull collection of words, I use formatting — such as Bold, Underline, and Italic — and colour-coding [see top image] to separate the blocks for the audience.
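The three structures hold the same terms, so one can be generated from another. As a rough sketch, the Python below builds the block-by-block and single-line forms from lists of terms. The function names are my own, only two of the four concepts are included to keep it short, and the syntax simply mirrors the Ovid examples in this post; other interfaces will need different field tags and operators.

```python
def build_block(subject_headings, free_text_terms, fields="ti,ab"):
    """Combine subject headings and free-text terms for one concept with OR."""
    parts = list(subject_headings)
    if free_text_terms:
        parts.append(f"({' or '.join(free_text_terms)}).{fields}.")
    return " or ".join(parts)


def block_by_block(blocks):
    """Number each block and add a final line combining them with AND."""
    lines = [f"{i} {block}" for i, block in enumerate(blocks, start=1)]
    combine = " and ".join(str(i) for i in range(1, len(blocks) + 1))
    lines.append(f"{len(blocks) + 1} {combine}")
    return lines


def single_line(blocks):
    """Wrap each block in parentheses and join everything with AND."""
    return " and ".join(f"({block})" for block in blocks)


population = build_block(
    ["exp Black Person/"],
    ["Black*", "African*", "Multiethnic*", '"Women of Colo?r"'],
)
country = build_block(
    ["exp United States/"],
    ["United States", "USA", '"U.S."', '"U.S.A."'],
    fields="ti,ab,ad,gc,go,in",
)
blocks = [population, country]
numbered = block_by_block(blocks)
one_line = single_line(blocks)

print("\n".join(numbered))
print(one_line)
```

Keeping the terms in lists like this also makes it easy to go the other way and print one term per line for training or peer review.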

Summary

The structure of the search strategy depends on the purpose of the presentation of the search strategy, your or your customers’ preferences, the complexity of the search, and the database’s search capabilities.

I must confess not all the search interfaces allow you to use all three structures. Oddly, sometimes they may allow you to run a single line search, but the next time the search interface may stop you with an unknown error or break down/parse your query differently than the intended way. Assuming that you have all three options, I summarise the pros and cons in the following table, which is not complete, but I am happy to add more if you share:

[Table image: Search Strategy for Systematic Review]

Remember: in complex searches, we may have to use a combination of these three structures.

On 9th June 2021, I updated the entry to include input from a Twitter conversation. Special thanks to the contributors.

Originally appeared on Medium.

Cite as: Shokraneh, Farhad. Structure of Search Strategies for Systematic Reviews: Line by Line versus Block by Block versus Single-Line. Medium 6 June 2021; [Revised 9 June 2021]. Available from: https://farhadinfo.medium.com/structure-of-search-strategies-for-systematic-reviews-line-by-line-versus-block-by-block-versus-d59aae9e92df

How to Write a Literature Review?

Before You Start

This is one of the questions my students and colleagues ask most frequently. I go straight to my famous answers, which are actually questions; your answers to these questions will tell you how to write a literature review:

Who is going to write the review?

Is it you, a team of authors, or are you asking an external review-writing consultancy service to write it for you? The answer to this question is important because it guides the planning stage.

What is the review about?

The topic or subject of the review should be specified. It might be too broad, so you will be dealing with lots of literature, or it might be too narrow, so it will be very hard to find the relevant literature.

Why do you write this review?

Is it coursework, a paper for publication or presentation, or is it for a blog or podcast? Is it a standalone work or is it part of a bigger plan, such as a work package for research or a chapter in a dissertation? Depending on the purpose, your plan may change.

How much time do you have to write the review?

Timing is key because it definitely affects the quantity and quality of your work. It is always advisable to plan ahead.

Where do you want to publish or present or submit the review?

If you plan to deliver it to someone else, who are they? Is it an organisation or a journal? Do they follow a certain style or guidelines for writing the review, and do they provide any examples? Again, it is important to know the destination so you don’t have to fit a square peg in a round hole.

What’s the length, format, and type of the review?

You need to know if there are word limits or formatting standards to follow. And finally, you need to know the type of your review because we have over 51 types of literature reviews. When people talk about a ‘literature review’ they usually mean the most traditional type of review, which is the ‘narrative review’.

What is the quality of this review going to be?

Do you want high-quality work to be presented as academic work and shine in your resume for the rest of your career and life, or are you looking for something ‘good enough’ to be submitted and get you a mark for a course? The answer makes a huge difference because, once you publish an academic work, it becomes part of your professional reputation.

Resources

You have three types of resources to juggle and I call it Resources Triangle:

  1. Money: with money you can hire consultancy services and pay a freelance professional to write your review. It is usually easy and does not require any legal contract; ethically, however, you should acknowledge the writer in the final product unless you agree otherwise. If so, you become a ‘guest author’ and the original writer stays a ‘ghost author’. Although controversial, it is usually accepted by many consultants. If you don’t have money, do you have people or time instead?
  2. Human Resources: you may have a team or friends or students who can help you with writing, or it could be you on your own. It is important to know who you have and for how long so you can plan timewise. When working with a team, it is important to agree on the amount of contribution from beginning to end, so that when the work is done you know everyone’s contribution and the order in which everyone’s name should appear on the final work. You also need to know how skilled they are; in particular, are they fluent in using any computer programs? If you need human resources, can you hire them?
  3. Time: How much time do you have, and can you make time by adding people or spending money?

These three resources are closely connected and balance each other. For example, if you have enough time, you can write a review gradually, at less cost, and with fewer human resources. If you don’t have time, then you may buy other people’s time so they write the review for you.

Planning

Step 1: Choose a review type

Depending on your purpose, you may choose one of over 51 types of literature reviews and follow their steps. Otherwise, if you choose to go with a traditional narrative review, you can follow the next steps.

Step 2: Choose a topic for your review

Start with a topic of interest and look around the internet for half an hour. See if there’s anything in Wikipedia or any other recent review papers relevant to the topic. For example, if you are working on a review about machine learning in neurology, you may find a paper about machine learning in cardiology. It is important to see similar topics and similar reviews because you can follow in their footsteps in structure and content.

Step 3: Outline the topic of your review

Draw a map of the topic and its subtopics to help you structure your paper. For example, if you are writing about treatments of migraine, you can start with what migraine is and how it starts and then move to the suggested treatment pathways. You can classify the treatments based on each pathway or classify them in categories such as psychotherapies, pharmacotherapies, and allied and alternative medicine. You can continue by comparing them in a table or listing the most recent systematic reviews about their effectiveness in a table. Finally, you can have another section on the most promising ongoing research or finish with current clinical practice, citing the clinical practice guidelines.

Step 4: Determine the amount of text/references per review section

You have no choice but to set a stopping point; otherwise your excitement and enthusiasm will drain your time and energy. For example, specify that you will have one, two, or three paragraphs per subtopic and that each paragraph is going to have two or three references. Of course, you need to run searches to find the relevant literature. This is not a strict rule: for example, one paragraph will be enough for your last section, which is usually the conclusion.

Step 5: Make the order and content of the review logical and connected

See if the structure of your writing makes sense. Are the paragraphs connected in a way that is easy to follow and read? If this writing were someone else’s, what would be your opinion on improving it? Is it boring or interesting?

Step 6: Choose a writing style for your review

Are you going to summarise the most recent literature? Or are you trying to critically appraise a topic or an idea? Or do you care about the pros and cons of a topic or idea?

Step 7: Leave the review aside for one or two weeks

When you spend too much time on your writing, you probably cannot see the mistakes or the points to improve. If you leave your writing aside for a while, you will see new things when you return to it, with new insights and new ideas.

Step 8: Ask for independent view on your review

Although not always possible, it is always good to ask for an independent view from someone else. You can do this alongside the previous step to save time.

Last words

I know this is not a perfect guide on how to write a literature review but it is general and easy enough to help you start your review.

Good news

1. If you don’t like to comment and share your ideas or questions, you can contact us instead.

2. I will update this blog again based on your feedback.

Systematic reviews for informed decision making: From shopping to healthcare research

Knowledge is power, BUT whether you are shopping online or working on a research project in your university/company, knowledge is not easy to find or cheap to gain.

If you have tried to purchase a product and spent hours if not days and months doing ‘research’, you’d know what we mean. Here by ‘re-search’ you usually mean ‘search’ and ‘search again’.

Since commercial companies are aware of your efforts, they usually try to bombard you with ‘selective’, ‘wrong’, ‘ambiguous’, or ‘incomplete’ information. So, as a judge, what would you do when one says [in their heart and mind]:

I Swear NOT to Tell the Truth, Tell Part of the Truth and Mix It with Lies

Well, your decision making will be ‘biased’ towards what they want. That’s why you try to be ‘systematic’ before buying a product:

  • You search multiple websites;
  • You read customer reviews;
  • You check five-star ratings;
  • You compare the features and prices;
  • You check the guarantee, return/refund policy;
  • You ask your colleagues, friends and families, and considering all this information …
  • You finally decide to drink a cuppa coffee/tea, give it time and buy the product 60% off on Boxing Day!

Don’t be surprised if I tell you that you can follow a career in research and you will find all the shopping steps to be similar to a medical research/literature review!

When at university, you are usually asked to do coursework, which is a kind of ‘research’ but not the rigorous kind, so you usually don’t dare to publish it or share it publicly. Later, when you start a career, you are still looking for the knowledge to make good or informed decisions. Even if you become a policy maker, this research will never stop because knowledge is power.

What makes the process of this ‘systematic’ re-search difficult is that first, you do not have time, and second, even if you have time, you do not have the skills to conduct proper and comprehensive research on existing knowledge or even existing ignorance [fear of unknown]!

If you miss a relevant work in your research, you can end up re-inventing the wheel if you are lucky, or a rectangular wheel if you are less lucky, or be accused of stealing someone else’s wheel, and that’s bad luck! Why leave it to luck if there is a way to get it right?

Systematic reviews, which start with a well-written question followed by a systematic search in multiple resources, seem to be the best tool you ever had to collect and analyse all the relevant information to make an informed decision. However, since systematic reviewing can be a time-consuming and resource-intensive process, there are at least 48 types of literature reviews with some ‘systematic’ elements embedded in them. Yesterday, we discovered the 49th type! Each type of literature review fits a certain purpose, so don’t you dare think that literature reviews are only ‘narrative’ or ‘systematic’. Such a classification has unforgivable consequences, including wasted time and money, if not lives.

This is just an introductory blog; we will write more soon.