Hidden data and backups
Overview of file erasure
In most filesystems deleting a file means that the disk space containing the file is marked "free". It means that the operating system can now overwrite this space when it needs to. However, exactly where and how a file is written on a disk depends on the filesystem and operating system in use, so an ordinary user has no way of knowing when a specific disk region is overwritten with new data.
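As long as the space occupied by a deleted file has not been overwritten with new data, the file remains recoverable (with forensics software). Hence, to securely delete a file on a hard disk drive, its contents must be overwritten immediately instead of being left for the operating system to overwrite at some later time.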
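The idea of overwriting a file before deleting it can be sketched as follows. This is a minimal illustration, not how any particular tool works: the function names `overwrite_file` and `overwrite_and_delete` are made up for this example, and the sketch assumes a hard disk drive and a file system that writes data in place. On SSDs, or on journaling and copy-on-write file systems, old copies of the data may survive elsewhere on the disk.

```python
import os
import secrets


def overwrite_file(path, passes=1, chunk_size=1024 * 1024):
    """Overwrite the contents of a file in place with random bytes."""
    size = os.path.getsize(path)
    # "r+b" opens the file for updating without truncating it,
    # so the new bytes are written over the existing contents.
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(secrets.token_bytes(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the disk


def overwrite_and_delete(path, passes=1):
    """Overwrite a file and only then remove its directory entry."""
    overwrite_file(path, passes)
    os.remove(path)
```

Calling `overwrite_and_delete("notes.txt")` (a hypothetical file name) would first replace the contents with random data and then delete the file. The file name and metadata are not scrubbed by this sketch.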
Securely erasing the whole disk
It is advisable to securely erase the whole disk if the disk (or the computer along with the disk) is sold or given to somebody else. The methods of performing secure erase depend on the type of disk.
For traditional hard disk drives (HDD), the data can be securely deleted by overwriting it. In most cases, secure deletion software overwrites the deleted data many times with zeros and/or random data. For example, one can use Darik's Boot And Nuke to securely delete the whole disk, but many alternatives exist.
For solid-state drives (SSD), overwriting the whole disk causes unnecessary wear and does not work as intended. SSDs have a built-in controller that decides where and how data is written so that all disk areas are worn out evenly. This means that the operating system does not know where exactly the file physically resides on the disk.
For deleted files to be securely removed from an SSD, the disk must support the TRIM functionality. TRIM allows the operating system to tell the disk controller which data blocks are no longer in use. Unfortunately, many older SSDs do not support TRIM. Moreover, whether TRIM works is determined by the combination of the disk, operating system, file system and the way the disk is connected to the computer. USB flash drives do not support TRIM either. More information can be found in the following report: SSD Forensics 2014.
For example, TRIM is not supported if:
- SSD is part of a RAID array
- SSD is connected by FireWire
- SSD is used as an external disk with USB
- SSD is part of Network Attached Storage (NAS)
- the operating system is too old
- a file system other than NTFS is used in Windows
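Whether the operating system is able to send TRIM (discard) requests to a disk can be checked on Linux. The following is a rough sketch that assumes the usual sysfs layout, where the file `/sys/block/<device>/queue/discard_max_bytes` contains 0 if the device does not accept discard requests. The function name `supports_discard` and the device name `sda` are only examples. The result shows what the kernel can do with the device, not whether the file system actually issues TRIM.

```python
import os


def supports_discard(device, sysfs_root="/sys/block"):
    """Return True or False, or None if the information is unavailable."""
    path = os.path.join(sysfs_root, device, "queue", "discard_max_bytes")
    try:
        with open(path) as f:
            # A value of 0 means that discard requests are not supported.
            return int(f.read().strip()) > 0
    except (OSError, ValueError):
        # Not Linux, no such device, or an unexpected file format.
        return None


if __name__ == "__main__":
    print(supports_discard("sda"))  # "sda" is an example device name
```

The `sysfs_root` parameter exists only to make the function easy to try out against a test directory.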
If the disk supports it, the ATA Secure Erase command should be used to securely erase the whole SSD. On a typical drive, this command makes the controller apply an erase voltage to all flash memory cells, which removes the stored charge and with it the data.
Secure erasure software
Windows
Securely deleting individual files on HDDs follows the same logic as erasing the whole disk: the files must be overwritten with zeros and/or random data several times. In Windows, there are several alternatives to choose from, for example SDelete and BleachBit.
EFF recommends BleachBit and has posted instructions on how to use it: How to: Delete Your Data Securely on Windows. More detailed information about BleachBit can be found in its documentation. However, if you prefer the Windows command line, you can also try out the SDelete tool by Sysinternals.
Lab exercise. Start "BleachBit" in your virtual machine. Delete a file securely by overwriting it. If you don't have a file to delete, download or create the file yourself.
macOS
In Mac OS X versions prior to El Capitan (10.11), secure file deletion was built into the operating system. The files had to be moved to Trash as usual, after which the user chose Finder -> Secure Empty Trash... from Finder's menu. Unfortunately, this functionality was removed in OS X El Capitan due to a security issue. Hence, a third-party tool has to be used, although encrypting the whole disk may also solve the issue in some cases. The functionality to securely erase a whole disk or partition is still built into Disk Utility (2024).
Linux
In most Linux distributions, either the shred or the srm command-line program is available.
Temporary files
In everyday usage, the operating system creates many temporary files to accelerate the workflow or perform some task (e.g. printing). Web browsers save browsing history, cookies, temporary and downloaded files. Windows stores log files, a list of recently opened documents, etc. Most of these files can be deleted individually, but it makes sense to use software that is designed to delete most or all of them at once.
Windows has a built-in application Disk Cleanup that it automatically invokes when the system drive is running out of space. However, it can also be run manually.
By default, Disk Cleanup offers to delete the current user's temporary files, but it is advisable to click "Clean up system files" if the user has administrator permissions. Among other things, this makes it possible to remove outdated system updates, which may take up a lot of disk space.
Disk Cleanup is a quick and simple solution, but there are also alternatives with more features, e.g. CCleaner. While CCleaner has been a popular and convenient tool, it has had several issues since 2017. The most significant issue involved hackers having compromised the tool for over a month before it was detected. Although this was fixed, the damage to the reputation was already done. In 2018, the CCleaner installer bundled Avast Antivirus. Thus, if you need the tool, use it at your own risk. The functionalities offered by CCleaner can be replaced by manual work or with other software that behaves similarly.
CCleaner searches for temporary files in more locations (e.g. in application-specific locations) and offers more granular control over their removal. However, one must be careful, as CCleaner may offer to delete some files that the system still needs.
Data recovery
If secure file deletion is not used, deleted files can still be accessed until they are overwritten. This means that it is possible to recover some files that are accidentally deleted. Moreover, this method is used by digital forensics when investigating confiscated computers. Even if the file cannot be restored in full, it may be possible to obtain some meta-information (time of creation, author) that can be used as evidence.
There are a lot of file recovery tools for all operating systems. Here is a shortlist of free software for Windows:
- Recuva Free - http://www.piriform.com/recuva
- TestDisk - http://www.cgsecurity.org/wiki/TestDisk
- TestDisk can repair corrupted partitions and file systems and recover deleted files. It also works in macOS and Linux.
- PhotoRec - http://www.cgsecurity.org/wiki/PhotoRec
- PhotoRec ignores the file system and reads file identifiers from the block device directly. Hence it is possible to use it even with corrupted file systems. It also works in macOS and Linux.
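The idea of recovering files by their identifiers (signatures) instead of relying on the file system can be illustrated with a toy example. The sketch below is not PhotoRec's actual algorithm. It only searches a block of raw bytes for the well-known start and end markers of JPEG files, and the function name `carve_jpegs` is made up for this example. Real tools also handle fragmented files, embedded thumbnails and hundreds of other file formats.

```python
JPEG_START = b"\xff\xd8\xff"  # start-of-image marker plus the first byte of the next marker
JPEG_END = b"\xff\xd9"        # end-of-image marker


def carve_jpegs(raw):
    """Return the byte ranges of `raw` that look like complete JPEG files."""
    found = []
    pos = 0
    while True:
        start = raw.find(JPEG_START, pos)
        if start == -1:
            break
        end = raw.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break  # start marker without an end marker: give up
        found.append(raw[start:end + len(JPEG_END)])
        pos = end + len(JPEG_END)
    return found
```

In practice, `raw` would be read from a disk image or a block device, which usually requires administrator permissions.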
Lab exercise: Start Recuva Free from the virtual machine and understand how it works.
- Task 1: Download some documents and pictures and delete them in the usual way. Try to find and restore them with Recuva Free. Now delete the files securely and run recovery software again.
- Task 2: Try to find information on the partition that is named "Virtual USB". This task is a part of the first homework. The upload form for the restored files can be found under the first homework task.
Data backups
Important files should be backed up so that a hardware or software failure or careless usage by oneself does not render these files permanently inaccessible. In general, it is advised to have multiple backups to prevent the loss of data in case one of the backups gets corrupted or is destroyed in an accident. Ideally, backups follow the 3-2-1 rule: have three copies of every file, use at least two different types of storage media (there's a version of the rule that suggests two different locations instead) and keep one copy offsite (or offline).
It is easy to keep smaller files (e.g. documents) in cloud storage so that you don't have to handle backups yourself. Of course, it is advisable to familiarize yourself with the terms of service to see if the cloud service provider actually provides some guarantees on the availability of data.
For an individual, backing up large files to cloud storage might not be feasible as uploading data can take time, and monthly storage fees apply. Here, using backup software to back up data to either an external or network drive might be a better idea. Of course, one can handle backups manually by periodically copying important files to an external drive, but using backup software has some advantages:
- Speed - instead of copying all files over to the backup drive, only files that have changed since the last backup are updated. This is called incremental backup.
- The last few versions (revisions) of each file are kept, so it is possible to undo your changes.
- Backups are done automatically, or the user is notified if a new backup needs to be done.
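The idea behind an incremental backup can be sketched in a few lines. This is a simplified illustration and not how any specific backup program works: the function name `incremental_backup` is made up, a file is considered unchanged if the backup copy has the same size and is not older than the source, and neither old revisions nor deleted files are handled.

```python
import os
import shutil


def incremental_backup(source_dir, backup_dir):
    """Copy only new or changed files; return their relative paths."""
    copied = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, source_dir)
            dst = os.path.join(backup_dir, rel)
            if os.path.exists(dst):
                s, d = os.stat(src), os.stat(dst)
                # copy2 preserves the modification time, so an unchanged
                # file has the same size and a backup that is not older.
                if s.st_size == d.st_size and s.st_mtime <= d.st_mtime:
                    continue
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)
            copied.append(rel)
    return sorted(copied)
```

Running the function twice in a row on the same folders copies everything the first time and nothing the second time, which is where the speed advantage comes from.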
Backing up data is so important that most operating systems have some backup tools built in. In Windows, there is Windows Backup (Windows 10 and 11). It lets the user choose which folders to back up, but the user can also choose to let Windows decide. Backups are created automatically to an external drive when it is connected to the computer, or the user is reminded from time to time to connect the drive. More detailed instructions for Windows 10 can be found in: How to Back Up and Restore Your Files, Apps, and Settings in Windows (2024).
In macOS, there is Time Machine that automatically backs up all files (user-defined excludes are possible) to an external or network drive.
As there are many different Linux distributions, there is also no single backup software. Some distributions may have some backup software bundled, but mostly it is up to the user to choose one. For example, Flexbackup can be used for automatic incremental backups, or rsync for manually creating a backup system.
It is important to know that restoring data from every backup system must be tested from time to time. Even years' worth of backups are useless if the files cannot be restored when needed!
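One simple way to test a restore is to restore the backup to a separate folder and compare the restored files with the originals. The sketch below does this by comparing SHA-256 checksums. The function names `file_digest` and `verify_restore` are made up for this example, and the sketch assumes that the original files have not been changed since the backup was made.

```python
import hashlib
import os


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file as a hex string."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(original_dir, restored_dir):
    """Return relative paths that are missing or differ in the restored copy."""
    problems = []
    for root, _dirs, files in os.walk(original_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, original_dir)
            restored = os.path.join(restored_dir, rel)
            if not os.path.isfile(restored) or file_digest(src) != file_digest(restored):
                problems.append(rel)
    return sorted(problems)
```

An empty list means that every original file was found in the restored folder with identical contents.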
Long-term data retention
Most data storage media eventually wears out when used. Data on storage media can become corrupted because of the surrounding environment: humidity, temperature, radiation and salts. Moreover, technology evolves so rapidly that data storage media may become unreadable because:
- there are no devices to work with that type of media
- there are no compatible connections to connect the storage device with
- no drivers are maintained that work with these storage devices
- no maintained software can read or write this storage media anymore
Some examples:
- 3.5" floppy disks
- video cassettes
- console game cartridges
- many new computers do not have optical drives (CD/DVD)
- many tablets do not have USB host ports
In Estonia, the National Archives also deals with preserving digital documents and thus has a long-term data retention policy. For example, text documents are archived in either plain text, XML or PDF format. Images are archived in TIFF or PNG format, sound in uncompressed WAV and videos in MPEG (source, in Estonian).
More information about the life span of different storage media can be found here and here. There is also a nice illustration of the estimated lifespan of media (source: https://www.crashplan.com).
Further reading
- Forensics
- SSD forensics
- Backups and lifetime of storage media