More science than admin

Posted: Thu Feb 06, 2025 8:26 am
by asimj1

What I have learned is that, in applications for extracts of large-scale data, the research team should be as selective as possible in the variables requested, which increases the chances that the data needed are anonymous. Otherwise, the application and approval process becomes longer, and in some cases sharing for research is not permitted due to a conflict with the privacy information shared with the data subjects.

Data linkage remains a challenge because some form of meaningful identifier or pseudonym is required for matching.
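
To illustrate the pseudonym point, here is a minimal sketch of one common approach: replacing a direct identifier with a keyed hash (HMAC-SHA256) so that two extracts can still be matched. This is my own illustration, not a description of any particular data provider's process. The field name `patient_id`, the helper names and the normalisation rule are all hypothetical, and in practice each data holder (or a trusted third party) would pseudonymise its own extract with the shared key before release; both steps are done in one function here only for brevity.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a keyed-hash pseudonym for an identifier (hypothetical helper)."""
    # Assumed normalisation: trim, upper-case, drop spaces, so that trivial
    # formatting differences do not prevent a match.
    normalised = identifier.strip().upper().replace(" ", "")
    return hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(left, right, secret_key, id_field="patient_id"):
    """Join two lists of dicts on the pseudonym, dropping the raw identifier."""
    # Index the right-hand extract by pseudonym.
    index = {pseudonymise(row[id_field], secret_key): row for row in right}
    linked = []
    for row in left:
        pseudonym = pseudonymise(row[id_field], secret_key)
        match = index.get(pseudonym)
        if match is None:
            continue  # no counterpart in the other extract
        merged = {k: v for k, v in row.items() if k != id_field}
        merged.update({k: v for k, v in match.items() if k != id_field})
        merged["pseudonym"] = pseudonym
        linked.append(merged)
    return linked
```

The linked rows carry only the pseudonym, not the original identifier, but anyone holding the key can regenerate the mapping, so the key needs the same protection as the identifiers themselves.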

Machine learning approaches face similar challenges in that a broad range of potentially informative variables is needed for the analysis, and some algorithms might increase the risk of re-identification.
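
One rough way to see how each extra variable affects re-identification risk is a k-anonymity style check: count how many records share each combination of quasi-identifying values and look at the smallest group. The sketch below is only an illustration under my own assumptions; the function names and the example fields (`age_band`, `sex`, `region`) are hypothetical, and a real disclosure-risk assessment would consider much more than group sizes.

```python
from collections import Counter


def _group_counts(rows, quasi_identifiers):
    # Count records per combination of quasi-identifier values.
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def smallest_group_size(rows, quasi_identifiers):
    """Return the size of the smallest group (the 'k' the extract achieves)."""
    counts = _group_counts(rows, quasi_identifiers)
    return min(counts.values()) if counts else 0


def risky_rows(rows, quasi_identifiers, k):
    """Return rows whose quasi-identifier combination occurs fewer than k times."""
    counts = _group_counts(rows, quasi_identifiers)
    return [
        row for row in rows
        if counts[tuple(row[q] for q in quasi_identifiers)] < k
    ]
```

Running `smallest_group_size` once with the minimal variable list and again with each candidate variable added shows which requests push some records into very small groups, which is where being selective pays off.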

In some cases the analysis just won’t be worthwhile with an anonymous extract, but careful consideration of the variables needed for the project will facilitate both linkage and data applications.

Similar considerations need to be made by research teams planning to collect data intended to be shared for open research.