The Web sites were then examined by 1 researcher (J.J.W.) to determine whether they included privacy statements. This review covered the page that the link led to, the Web site’s homepage, any Web site policies, “about us” material, and similar pages. Links to privacy foundations such as Health on the Net were considered to represent privacy statements. If a privacy statement was found, the text was copied into Microsoft Word 98 for Macintosh (Microsoft Corporation, Redmond, WA), and each document’s Flesch Reading Ease score was generated using the software built into that program. The same privacy statements were printed and then analyzed using 2 hand-calculation methods: the Fry Readability formula and the SMOG method (simplified measure of gobbledygook).15 If a Web site was found to have no privacy statement, this was confirmed by a second researcher (M.G.). Web sites on which no privacy policy was found after 2 searches were not contacted to determine whether a policy existed. End users are unlikely to contact a site to ask whether it has a privacy policy; for them, a policy that cannot be found is, de facto, absent. Ten percent of the readability levels were reanalyzed by a second researcher (D.M.D.), and the coding was found to be accurate. All data were entered into a Microsoft Excel spreadsheet and analyzed using the built-in statistical formulas. We did not seek to evaluate the content of the privacy statements themselves.
Approximately 40 different readability formulas are available for use, all of which will give a reasonably accurate grade level (generally plus or minus 1 grade level with 68% confidence, an acceptable standard in the field).16 Most of these formulas rely on the number of syllables per word and on sentence length to judge readability. In general, text is easier to read when it has few words per sentence, few characters per word, few syllables per word, and a low percentage of passive sentences.16 Illustrations and tables also improve readability. The Flesch Reading Ease score is one of the most widely used and validated systems for scoring readability. It is the standard used by the insurance industry for consumer documents and contracts.17,18 Documents scoring 70 or above are described as “easy” and are written at the grade school level. A score of 60 to 70 is described as “standard” and is written at approximately the high school level. Scores below 60 are described as “fairly difficult,” “difficult,” or “very difficult” as the score decreases.19 The Fry formula is a hand-calculation method that is recommended by experts in the field for use with patient education materials. The SMOG is another hand-calculation method commonly used for evaluation of health information.16 Results from the Fry formula and SMOG methods are expressed as standard United States grade levels.
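To illustrate how such formulas combine sentence length and syllable counts, the following is a minimal Python sketch of the published Flesch Reading Ease equation and of the regression form of the SMOG grade. It is not the procedure used in this study, which relied on Microsoft Word and on hand calculation. The syllable counter is a rough vowel-group heuristic assumed here for illustration, and the function names are hypothetical, so scores will differ somewhat from those of other tools. The Fry method is graph based and is not reproduced.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (an assumed heuristic)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Drop a trailing silent "e" (as in "score") but keep "-le" endings (as in "table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str):
    """Return (sentence count, word count, list of per-word syllable counts)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return len(sentences), len(words), [count_syllables(w) for w in words]


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    n_sentences, n_words, syllables = text_stats(text)
    return (206.835
            - 1.015 * (n_words / n_sentences)
            - 84.6 * (sum(syllables) / n_words))


def smog_grade(text: str) -> float:
    """SMOG grade from words of 3 or more syllables, scaled to 30 sentences."""
    n_sentences, _, syllables = text_stats(text)
    polysyllables = sum(1 for s in syllables if s >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / n_sentences) + 3.1291


if __name__ == "__main__":
    sample = ("Personally identifiable information may be disclosed "
              "to affiliated organizations.")
    print(round(flesch_reading_ease(sample), 1), round(smog_grade(sample), 1))
```

Long sentences and polysyllabic words both lower the Flesch score and raise the SMOG grade, which is why dense legal phrasing tends to score as difficult.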
Results
Of the 80 health Web sites in the initial sample, 2 had recently closed down and an additional 24 (30%) (including 23% of the commercial Web sites) did not have a privacy statement. Thus, the privacy statements of 54 Web sites were analyzed using the 3 readability formulas. The Web site domains for the 80 Web sites as well as the 54 with privacy statements are shown in the Table. The average Flesch Reading Ease score of the privacy statements was 39, or “difficult.” This level is similar to that of a corporate annual report. The Flesch Reading Ease scores ranged from 24.4 to 54.2, that is, from “very difficult” (eg, similar to reading legal contracts) to “fairly difficult” (eg, similar to reading novels such as Henry James’s The Ambassadors).19 The Fry formula yielded a readability level of 14.6, or 14th grade, 6th month (range, 10–17). The overall SMOG readability level was 14.7 (range, 11–19).
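The descriptive labels attached to these scores can be reproduced with a simple lookup. The sketch below assumes the commonly cited seven-band Flesch table, which is finer grained than the summary given in the Methods; the cut points and the function name `describe_flesch` are assumptions for illustration, not part of the study's analysis.

```python
# Assumed Flesch bands: (lower bound of score, descriptive label), highest first.
FLESCH_BANDS = [
    (90, "very easy"),
    (80, "easy"),
    (70, "fairly easy"),
    (60, "standard"),
    (50, "fairly difficult"),
    (30, "difficult"),
    (0, "very difficult"),
]


def describe_flesch(score: float) -> str:
    """Return the descriptive label for a Flesch Reading Ease score."""
    for lower_bound, label in FLESCH_BANDS:
        if score >= lower_bound:
            return label
    # Scores below 0 are possible for extremely dense text.
    return "very difficult"


if __name__ == "__main__":
    # Mean, minimum, and maximum scores reported in the Results.
    for score in (39, 24.4, 54.2):
        print(score, describe_flesch(score))
```

Under these assumed bands, the reported mean falls in “difficult” and the reported range runs from “very difficult” to “fairly difficult,” matching the labels used in the text.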
TABLE
Domains of all Web sites examined and percentage with privacy statements
Top level domain | All Web sites examined (n = 80) | Web sites with a privacy policy (n = 54) |
---|---|---|
Commercial (.com) | 48 (60%) | 37 (68.5%) |
Organizations (.org) | 14 (17.5%) | 8 (14.8%) |
United Kingdom (.uk) | 7 (8.8%) | 5 (9.2%) |
Government (.gov) | 3 (3.8%) | 3 (5.6%) |
Network (.net) | 2 (2.5%) | 0 (0.0%) |
Educational (.edu) | 2 (2.5%) | 1 (1.8%) |
Web site did not exist* | 2 (2.5%) | 0 (0.0%) |
South Africa (.za) | 1 (1.2%) | 0 (0.0%) |
Numerical (no domain listed) | 1 (1.2%) | 0 (0.0%) |
Total | 80 (100%) | 54 (100%) |
*Although the list of the top 25 health-related WWW sites was current, 2 of the sites had gone out of business and were no longer available.
Discussion
The goal of this study was to determine the readability level of privacy statements on Internet health Web sites. Privacy statements are meant to protect the individual, a particularly important endeavor when dealing with medical information because of its potentially sensitive nature. Only 70% of the health Web sites examined in this study even contained a privacy statement. The readability levels of the privacy statements found were considered difficult and would require approximately 2 years of university training to be comprehensible. In contrast, only 60% of diabetic patients in 1 study could understand information written at the 6th grade reading level.20 Most studies find that only a minority of patients can comprehend information written at a 9th grade reading level.21-24 Overall, for 90% of adults to be able to read and comprehend written materials, the materials should be written at less than the 8th grade reading level.16 Although current Internet users may be a more educated group than the general population, this trend is changing. A recent study found that 21% of individuals with less than a high school education have Internet access, as do 43% of high school graduates.25 Because an individual’s actual reading level is usually 2 to 5 grades below the grade completed,26 many of these individuals will not be able to comprehend a Web site’s privacy policy and thus may not be protected or understand their options pertaining to protection.
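The gap described above can be made concrete with a short calculation. The sketch below applies the cited rule of thumb (actual reading level 2 to 5 grades below the grade completed) to two hypothetical readers and compares the result with the Fry level of 14.6 reported in the Results. The example readers and the function name are assumptions chosen for illustration.

```python
STATEMENT_GRADE = 14.6  # Fry readability level of the privacy statements (Results)


def estimated_reading_range(grade_completed: int, gap=(2, 5)):
    """Return (lowest, highest) likely reading grade for a given grade completed."""
    smallest_gap, largest_gap = gap
    low = max(grade_completed - largest_gap, 0)
    high = max(grade_completed - smallest_gap, 0)
    return low, high


if __name__ == "__main__":
    # Hypothetical readers: (description, last grade completed).
    readers = [("high school graduate", 12), ("4-year college graduate", 16)]
    for description, grade in readers:
        low, high = estimated_reading_range(grade)
        readable = high >= STATEMENT_GRADE
        print(f"{description}: reads at grade {low}-{high}; "
              f"statement readable: {readable}")
```

Under this rule of thumb, even the upper estimate for both hypothetical readers falls below the level at which the privacy statements were written.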