E.8.5 ICGC Security Best Practices for Controlled-Access Data

Version 2.0 August 2021 

1. Introduction

This document is intended for users and officials at academic institutions and scientific organizations whose investigators have been granted access to the controlled tier of the International Cancer Genome Consortium (ICGC), comprising both the ICGC-25K project (http://icgc.org) and the ICGC-ARGO project (https://platform.icgc-argo.org). It provides an outline of the ICGC’s expectations for the management and protection of ICGC controlled access data transferred to and maintained by institutions, whether in their own institutional data storage systems or in cloud computing systems. The principles governing access and use of such data are outlined in the ICGC ARGO Data Access and Use Policies and Guidelines web pages. The data handling guidelines described in these policies are intended to ensure that ICGC controlled access genomic and phenotypic data are kept secure and that access is limited to DACO approved researchers. These guidelines target two distinct audiences: scientific and administrative professionals, including institutional signing officials and the investigators who will use the data; and information technology professionals, including Chief Information Officers (CIOs), Information Systems Security Officers (ISSOs) and operations staff, whether working for central IT organizations or embedded within research groups. Accordingly, this document is split into separate sections focused on each group.

2. Information for Scientific and Administrative Staff  

General Consideration 

Under ICGC ARGO policies, the DACO-approved researcher’s institution is ultimately responsible for maintaining the confidentiality, integrity and availability of the data entrusted to it by the ICGC. Failure to provide appropriate controls can subject investigators or institutions to sanctions defined by the DACO, and can erode public confidence in the ability of ICGC and its stakeholders to carry out research using sensitive information. It is therefore essential that all recipients of controlled access data understand their responsibilities for ensuring appropriate information security controls and that they work with their IT organizations to effectively implement those responsibilities. While the ICGC provides this Best Practices document as a general guide to acceptable security practices, this document is not a substitute for a formal security plan that is devised for the specific local or cloud configuration chosen by the investigators and institution. The ICGC strongly recommends that investigators consult with institutional IT leaders, including the Chief Information Officer (CIO) and the institutional Information Systems Security Officer (ISSO) or equivalents, to develop the formal information security plan prior to receipt of controlled access data from the ICGC; institutional signing officials should validate that an appropriate security plan is in place prior to accepting liability for data loss or breach on behalf of the institution.

This document provides an overview of security principles for data, access, and physical security to ensure confidentiality, privacy, and accessibility of data. This is a minimum set of recommendations; additional restrictions may be needed by your institution and should be guided by the knowledge of the user community at your institution as well as your institution’s IT requirements and policies. The single most important element (regardless of type of infrastructure) for maintaining the security of ICGC controlled access data is to design security into the chosen environment before the data is transferred rather than attempting to add security controls to an environment after the data has been transferred (protection by design). Security controls should be on by default; investigators and users should not have to take any action to turn them on. To use an analogy, doors should be locked by default rather than need to be actively locked by someone. A corollary is that all users and support staff associated with the project need to have an information security mindset going into the project, and all must be aware that maintaining public support for the collection and dissemination of these types of data is their individual responsibility. It is essential that all staff members who will interact with the data or the systems that maintain the data have appropriate information security training. This is particularly true for groups that wish to use cloud computing, and in these cases, the ICGC recommends additional training to inform staff of the special risks that the use of such infrastructure entails.

Part of having an information security mindset is being aware of the multiple dimensions of access control and accountability at all times. This means ensuring that passwords and/or access devices (smart cards, soft or physical tokens, etc.) are physically safe, strong and not shared with anyone, and that data are both physically and logically (i.e., electronically) secure. Particular care must be taken with copies of data on portable electronic media and devices (e.g., laptops, tablets, USB thumb drives, tapes). Generally speaking, users should avoid putting controlled access data on such devices wherever possible. If it is necessary, such devices must be encrypted and should be treated as if they are cash, with appropriate physical and electronic controls, including remote wipe capability wherever possible. In addition, please remember that collaborators at different institutions must file a separate data access request even if they are working on the same project.

Finally, remember that data downloaded from ICGC-designated data repositories must be destroyed when they are no longer needed or used, or if the project data access period has expired or been terminated.  Investigators may retain only encrypted copies of the minimum data necessary at their  institution to comply with institutional scientific data retention policy and any data stored on temporary backup media as are required to maintain the integrity of the general institutional data  protection (i.e., backup) program.  

3. Additional Information Related to the Use of Cloud Computing  

Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to  a shared pool of configurable computing resources (e.g., networks, servers, storage,  applications and services) that can be rapidly provisioned and released with minimal  management effort or cloud service provider interaction. In contrast to traditional computing on  local servers and hardware, cloud computing often entails the transfer and storage of controlled access data on systems managed by a third party. Cloud computing offers a number of  advantages for authorized investigators but also requires additional  security considerations.  

Most of the recommendations described above apply to cloud computing; indeed, the primary difference is that while information security in cloud environments is still the responsibility of the institution, the implementation of that security is shared between the institution and the cloud  service provider. Thus, it is essential that institutions validate that they are partnering with a reputable cloud service provider. Institutions should ensure that they understand the security policies and practices utilized and recommended by their cloud service provider of choice, and may wish to obtain third party reviews or audits from the cloud service provider. Institutions should utilize these best practices, work with their cloud service provider to understand and implement the best practices associated with their specific environment and ensure that the cloud service provider can meet institutional information security requirements. Because misconfiguration of cloud resource access policies can potentially result in broad compromise of controlled-tier data, the ICGC strongly recommends that you consult with your institutional CIO,  ISSO and IT staff to ensure that an appropriate security plan is developed and that necessary  technical, training and policy controls are in place before data is migrated to cloud  environments. You and your institution are accountable for ensuring the security of  this data, not the cloud service provider. 

4. Information for IT Professionals  

Local Infrastructure Guidance  

General Information Security Guidelines  

  • When using local infrastructure, make sure that controlled access data files are never exposed to the Internet, with the exception of such connections as are required to download data from source repositories. Infrastructure should be behind local and/or institutional firewalls that block access from outside of the institution. For cloud infrastructure, investigators must restrict external access to instances and storage under the investigator’s control (see section on cloud computing for more details);
  • Data must never be posted on servers in any fashion that will make them publicly accessible, such as an investigator’s (or institution’s) website, because the files can be “discovered” by Internet search engines (e.g., Google or Bing);
  • Institutions must not set up web or other electronic services that host data publicly, or that provide access to other individuals who are not listed on the Data Use Request, even if those individuals have access to the same ICGC data;
  • Utilize strong authentication technology for access control. Two factor authentication technologies (smart cards, hard or soft token, etc.) are preferred. When using single  factor passwords, set policies that mandate strong passwords. For example:  

○ Minimum length of 12 characters; 

○ Does not contain user names, real names or company names; 

○ Does not contain a complete dictionary word; 

○ Contains characters from each of the following groups: lowercase letters,  uppercase letters, numerals, and special characters; 

○ Passwords should expire every 120 days or at the rate required by institutional  policies, whichever is more frequent.  

  • Avoid allowing users to place controlled access data on mobile devices (e.g., laptops,  smartphones, tablets, MP3 players) or removable media such as USB thumb drives  (except where such media are used as backups and follow appropriate physical security  controls). If data must be placed on mobile devices, it must be encrypted;  
  • Keep all software patches up to date. 
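The example password rules listed above can be sketched as a small checker. This is a minimal illustration, assuming Python and a caller-supplied word list; the function name `password_policy_violations`, its parameters, and the choice to skip dictionary words shorter than four letters are assumptions made for this example and are not part of the ICGC guidance. Password expiry is not covered here because it depends on the institution's account management system.

```python
import string

MIN_LENGTH = 12  # minimum length from the example policy above


def password_policy_violations(password, forbidden_names=(), dictionary_words=()):
    """Return a list of policy violations (an empty list means compliant)."""
    violations = []
    lowered = password.lower()

    if len(password) < MIN_LENGTH:
        violations.append("shorter than 12 characters")

    # User names, real names and company names must not appear in the password.
    for name in forbidden_names:
        if name and name.lower() in lowered:
            violations.append("contains a forbidden name")
            break

    # A complete dictionary word must not appear. Words shorter than four
    # letters are skipped (an assumption) to avoid flagging incidental matches.
    for word in dictionary_words:
        if len(word) >= 4 and word.lower() in lowered:
            violations.append("contains a dictionary word")
            break

    # At least one character from each group is required (ASCII only here).
    groups = {
        "lowercase letter": string.ascii_lowercase,
        "uppercase letter": string.ascii_uppercase,
        "numeral": string.digits,
        "special character": string.punctuation,
    }
    for label, chars in groups.items():
        if not any(c in chars for c in password):
            violations.append("missing: " + label)

    return violations
```

A check of this kind would normally run when a password is set or changed, with `forbidden_names` filled from the account record and `dictionary_words` loaded from whatever word list the institution already uses.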

Physical Security Guidelines  

  • Data that are in hard copy or reside on portable media (e.g., a USB stick, CD, flash drive or laptop) should be treated as though they were cash, with appropriate controls in place. Such media must be encrypted and stored in a secured, locked facility with access granted to the minimum number of individuals required to efficiently carry out research;
  • Restrict physical access to all servers, network hardware, storage arrays, firewalls and  backup media only to those that are required for efficient operations; 
  • Log access to secure facilities, ideally with electronic authentication.  

Controls for Servers

  • Keep servers from being accessible directly from the Internet (i.e., they must be behind a firewall or not connected to a larger network) and disable unnecessary services. It is better to begin with a server image that disables all non-essential services and restore those that are needed than to start with a full-featured image and disable unnecessary services;
  • Enforce the principle of Least Privilege to ensure that individuals and/or processes are granted only the rights and permissions needed to perform their assigned tasks and functions, but no more;
  • Secure controlled-access genomic and phenotypic data on the systems from other users (restrict directory permissions to only the owner and group) and if exported via file sharing, ensure limited access to remote systems; 
  • If accessing systems remotely, use encrypted data access (such as Secure Shell (SSH)  or Virtual Private Network (VPN)). It is preferred to use a tool such as Remote Desktop  (RDP), X-windows or Virtual Network Computing (VNC) that does not permit copying of  data and provides “View only” support;  
  • If data are used on multiple systems (such as a compute cluster), ensure that data  access policies are retained throughout the processing of the data on all the other  systems. If data are cached on local systems, directory protection must be kept, and  data must be removed when processing is complete. Requesting investigators must  meet the spirit and intent of these protection requirements to ensure a secure  environment 24 hours a day for the period of the agreement.  
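The directory-permission recommendation above (restricting access to only the owner and group) can be checked and applied programmatically. The following is a minimal sketch for POSIX systems, assuming Python; the function names are hypothetical, and the sketch does not cover POSIX ACLs or file-sharing export settings, which need to be reviewed separately.

```python
import os
import stat


def is_restricted_to_owner_and_group(path):
    """Return True if 'other' users have no read, write or execute bits on path."""
    mode = os.stat(path).st_mode
    return (mode & stat.S_IRWXO) == 0


def restrict_to_owner_and_group(path):
    """Clear all permission bits for 'other' users, leaving owner/group bits as-is."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~stat.S_IRWXO)
```

On a compute cluster, the same check could be run against any local cache directories before and after a job to confirm that directory protection was kept throughout processing.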

Source Data and Control of Copies of Data  

Approved users must retain the original version of the encrypted data, track all copies or extracts and ensure that the information is not divulged to anyone except authorized staff members at the institution. ICGC therefore recommends ensuring careful control of physical copies of data and providing appropriate logging on machines where such data are resident.  Restrict and monitor outbound access from devices that host controlled access data.  

Destruction of Data  

  • Data downloaded from ICGC-designated data repositories must be destroyed when they  are no longer needed or used, and in any event after DACO approval for the project  expires. Delete all data for the project from storage, virtual and physical machines,  databases, and random access archives (i.e., archival technology that allows for deletion  of specified records within the context of media containing multiple records);  
  • Investigators and Institutions may retain only encrypted copies of the minimum data  necessary at their institution to comply with institutional scientific data retention policy and any data stored on temporary backup media as are required to maintain the integrity of the institution’s data protection program. Ideally, the data will exist on backup media  that is not used by other projects and can therefore be destroyed or erased without impacting other users/tenants. If retaining the data on separate backup media is not possible, as will be the case with many users, the media may be retained for the  standard media retention period but may not be recovered for any purpose without a  new DACO Access request approved by the DACO office. Retained data should be  deleted at the appropriate time, according to institutional policies;  
  • Shred hard copies and other non-reusable physical media;  
  • Delete electronic files securely. For personal computers, the minimum would involve deleting files and emptying the recycle bin or equivalent with equivalent procedures for  servers. Optimally, use a secure method that performs a delete and overwrite of the  physical media that was used to store the files. Ensure that backups are reused (data  deleted) and any archive copies are also destroyed. Destroy media according to  suggested NIST Guidelines for Information Media Sanitization (http://csrc.nist.gov/publications/PubsSPs.html). 
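The "delete and overwrite" approach mentioned above can be illustrated with a short sketch, assuming Python and a conventional filesystem on magnetic storage. The function name and its `passes` and `chunk_size` parameters are assumptions made for this example. Overwriting in place does not reliably sanitize solid-state drives, copy-on-write or journaling filesystems, snapshots, or cloud volumes, so this is not a substitute for the NIST media sanitization guidance or for the institution's own destruction procedure.

```python
import os


def overwrite_and_delete(path, passes=1, chunk_size=1024 * 1024):
    """Overwrite a file's contents in place with random bytes, then remove it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            remaining = size
            while remaining > 0:
                block = os.urandom(min(chunk_size, remaining))
                handle.write(block)
                remaining -= len(block)
            # Push the overwritten bytes through OS buffers to the device.
            handle.flush()
            os.fsync(handle.fileno())
    os.remove(path)
```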

5. Additional Guidance for Cloud Computing

Due to the inherent Internet-connected nature of cloud computing as well as the potential issues  introduced by multi-tenant computing, institutions that wish to use cloud computing must work with their cloud service provider to devise an appropriate security plan that meets the additional Best Practices described below: 

General Cloud Computing Guidelines  

  • Whenever possible, use end-to-end encryption for network traffic. For example, use the  Secure Shell (SSH) protocol to encrypt traffic between you and your instance. Ensure  that your service uses only valid and up-to-date keys and/or certificates;  
  • Encrypt data at rest with a user's own keys, with a one-time autogenerated key (e.g., for  block-level encryption of temporary volumes), or with keys generated by a cloud  vendor’s key management service;  
  • Use security groups and firewalls to control inbound traffic access to your instance.  Ensure that your security profile is configured to allow access only to the minimum set of  ports required to provide necessary functionality for your services and limit access to  specific networks or hosts. In addition, allow administrative access only to the minimum  set of ports and source IP address ranges necessary;  
  • Be aware of the top 10 vulnerabilities for web applications and build your applications  accordingly. To learn more, visit Open Web Application Security Project (OWASP) - Top  10 Web Application Security Risks. When new Internet vulnerabilities are discovered, promptly update any web applications included in your Virtual Machine (VM) images; 
  • Review the Access Control Lists (ACLs), permissions, and security perimeter to ensure  consistent definition.  
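The recommendation above to limit administrative access to the minimum set of ports and source address ranges can be audited with a simple rule check. This is a minimal sketch, assuming Python and a hypothetical rule format (dictionaries with `source`, `from_port` and `to_port` keys); it is not any cloud provider's API, and real rules would first need to be exported from the provider and mapped into this shape.

```python
import ipaddress

ADMIN_PORTS = {22, 3389}  # SSH and RDP; adjust to the services actually in use


def overly_permissive_rules(rules, admin_ports=ADMIN_PORTS):
    """Return inbound rules that expose an administrative port to any address."""
    flagged = []
    for rule in rules:
        network = ipaddress.ip_network(rule["source"], strict=False)
        open_to_world = network.prefixlen == 0  # 0.0.0.0/0 or ::/0
        ports = range(rule["from_port"], rule["to_port"] + 1)
        exposes_admin = any(port in ports for port in admin_ports)
        if open_to_world and exposes_admin:
            flagged.append(rule)
    return flagged
```

For example, a rule allowing port 22 from `0.0.0.0/0` would be flagged, while the same port restricted to an institutional range such as `192.0.2.0/24` (a documentation address used here as a placeholder) would not.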

Audit and Accountability

  • Ensure that data is accessible only to those approved for access, and that controls for changing that access are retained by the investigator who submitted the DAR and the appropriate IT staff. A mechanism for monitoring and notification needs to be in place to detect permission changes;
  • Ensure that account access is logged along with access controls and file access and this  information is reviewed by the investigator on a regular basis to ensure continued secure  access.  
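One way to detect permission changes, as called for above, is to compare periodic snapshots of who has access to each resource. The sketch below assumes Python and a hypothetical snapshot format (a mapping from resource name to the set of principals with access); how the snapshots are collected and how notifications are sent depends on the environment and is not shown.

```python
def permission_changes(previous, current):
    """Compare two {resource: set_of_principals} snapshots and list differences."""
    changes = []
    for resource in sorted(set(previous) | set(current)):
        before = set(previous.get(resource, ()))
        after = set(current.get(resource, ()))
        for principal in sorted(after - before):
            changes.append((resource, "granted", principal))
        for principal in sorted(before - after):
            changes.append((resource, "revoked", principal))
    return changes
```

Any "granted" entry for a principal who is not listed on the approved application would be a reason for the investigator and IT staff to follow up.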

Image Specific Security  

  • Ensure images do not contain any known vulnerabilities, malware, or viruses. A number  of tools are available for scanning the software, such as Chkrootkit, rkhunter, OpenVAS  and Nessus;  
  • Ensure that Linux-based Images lock/disable root login and allow only sudo access.  Additionally, root password must not be null or blank;  
  • Ensure that images provide end users with OS-level administration capabilities to allow for compliance requirements, vulnerability updates, and log file access. For Linux-based images, this is normally through SSH, and for Windows-based virtual machine images, this is normally through RDP.
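The root-login requirements for Linux-based images can be spot-checked by inspecting the image's configuration files. The following is a minimal sketch, assuming Python, OpenSSH's `sshd_config` with its `PermitRootLogin` directive, and the standard colon-separated shadow file format. The function names are hypothetical, and the parser is deliberately simple: it handles only whitespace-separated directives and ignores `Match` blocks and `Include` files, so a passing result should be confirmed against the running configuration.

```python
def root_ssh_login_disabled(sshd_config_text):
    """Return True if the first PermitRootLogin directive is set to 'no'."""
    for line in sshd_config_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        parts = stripped.split(None, 1)
        if parts[0].lower() == "permitrootlogin":
            # sshd uses the first value it obtains for each keyword.
            return len(parts) == 2 and parts[1].strip().lower() == "no"
    # No explicit directive: treat as non-compliant rather than rely on defaults.
    return False


def root_password_is_blank(shadow_text):
    """Return True if the root entry in shadow-format text has an empty password field."""
    for line in shadow_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 2 and fields[0] == "root":
            return fields[1] == ""
    return False
```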

Best Practices for Specific Cloud Service Providers:  

Examples of cloud service provider best practices are provided in the links below; links to the best practices of additional cloud service providers will be periodically appended to this document when they become available. Please be aware that these are provided for convenience only, and do not imply endorsement by the ICGC for any of these services. The ICGC recommends that investigators consult with their cloud service provider to ensure that they are using the most up-to-date best practice documents.

Amazon Web Services:  

Google Cloud Platform:

Other Sources of Information for Cloud Best Practices:

Examples of cloud best practices from organizations that leverage the cloud are provided in the link below. Links to additional documentation will be periodically appended to this document when they become available. Please be aware that these are provided for convenience only, and do not imply endorsement by the ICGC for any of these services. The ICGC recommends that investigators consult with these organizations to ensure that they are using the most up-to-date best practice documents.

6. Information Technology (IT) Security Assessment and Access Checklist

As part of the ICGC ARGO Data Access Agreement you are expected to observe basic information security practices as described therein. At a minimum, You MUST agree to the following:

  • Physical security – ICGC Controlled Data will be maintained in secure physical environments on physically secure computer systems, such as in a locked office. ICGC Controlled Data must be encrypted when:
    • Stored on a laptop
    • Stored on a memory stick or portable hard drive
    • Stored or exchanged on other portable media such as CDs, DVDs
    • Exchanged with external organizations or individuals

Controlled data should not be stored or accessed on mobile phones or tablets unless  appropriate security measures are in place.

  • Access security – Only individuals who are listed on an Approved and active DACO Application will have access to ICGC Controlled Data. If copies of the ICGC Controlled Data are stored locally on a shared computer system or a file server, then they must be protected by strong passwords and/or encryption so that only the individuals named in the application have access to them. If the computer that holds ICGC Controlled Data is backed up, the backup media must be encrypted and/or stored in a physically secure location.
  • Network security – If ICGC Controlled Data are stored on a network-accessible computer, there must be controls in place to prevent access by computer “hackers”, or contamination by viruses, malware and spyware. Network security is usually implemented by Your institution's IT department and will consist of some combination of network firewalls, network intrusion monitoring, and virus scanning software.
  • End of project – After finishing the Research Project for which you are requesting access or if Your access approval is terminated, You must securely destroy all local copies of the ICGC Controlled Data, including any backup copies. However, if necessary, you may still keep the ICGC Controlled Data for archival purposes in conformity with national audits or other legal requirements.
  • Training – Everyone who will access and/or use ICGC Controlled Data must be trained in confidential data handling and the responsible use of personal health information, familiarized with the terms and conditions of the Data Access Agreement, and briefed on Your security plans. Training should include GDPR and Cyber Awareness Training and all Users must take steps against unauthorized data disclosure in accordance with The University of Glasgow confidential data handling guidance.
  • Compute Cloud Use – You may place copies of ICGC Controlled Data on a private or commercial compute cloud for analytical purposes. If You do so, You acknowledge that You maintain responsibility for the data and You agree that You must: take care to apply strong encryption to the data while in motion and at rest; restrict access to stored copies of the data to Yourself, authorized personnel, students and authorized collaborators; use firewall rules to restrict ingress and egress from virtual machines to trusted network address(es); keep virtual machines that host controlled data up to date with security patches; destroy all copies of the data, including snapshots and backups, at the end of the Research Project or if Your application is not renewed; and ensure there is an agreement in place with Your cloud provider that ensures You can meet these requirements. Any use of a private or commercial cloud is between You and the cloud provider. To the extent permitted by law, ICGC accepts no responsibility for any interaction between You and the cloud provider and is released from any liability arising out of or in any way connected with such interaction.

Access to ICGC Controlled Data is a procedure that entails legal and ethical obligations. You and Your institution must have modern, up-to-date information technology (IT) policies in place that must minimally include the following items:

  • Logging and auditing of access to data and to computer network
  • Password protection to computer network
  • Virus and malware protection to computers on computer network
  • Auditable data destruction procedure, when necessary
  • Secure data backup procedure, when necessary
  • Strong encryption on any portable device which may store or provide access to ICGC controlled access data
  • Privacy breach notification


1 Portions of this document have been adapted from the NIH’s “NIH Security Best Practices for Controlled Access Data Subject to the NIH Genomic Data Sharing (GDS) Policy”.