Everyone talks about the importance of online privacy, and savvy surfers know how to manage their personal data. But how much can you really tell about what information you’re sending when you search — and who is able to see that information?
Browsers leave two kinds of records, loosely separated into offline and online history. Your browser maintains an ongoing history of the sites that you visit. Visiting any single web page may also involve connecting to multiple other servers for images, video, advertisements and background scripts (such as JavaScript). For simplicity’s sake, your browser history displays only the page that you visited (the address in your browser’s address bar); the “hidden” connections don’t appear there, although traces of them may remain in the browser’s cache and cookies.
This browsing history is saved offline, accessible only by someone with access to your device. However, your browsing activity may also be saved online in various ways. Web servers typically record each incoming connection in one or more server logs, and visits to a given website may be stored in aggregate in a server database. Counts of this kind are one simple measure of a site’s popularity: the more unique IPs that are recorded, the more popular the website appears. (Search providers such as Google weigh many other signals when ranking results.) Additionally, your browser history may be saved online if it is linked to an online account, such as using Chrome with a Google Account, or Safari with an Apple ID.
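The unique-IP counting described above can be sketched in a few lines of Python. This is only an illustration: the function name `unique_visitors` and the sample records are hypothetical, the addresses come from ranges reserved for documentation, and a real site would read its records from its own log files.

```python
from collections import defaultdict


def unique_visitors(log_records):
    """Count distinct client IPs per site from (client_ip, site) pairs."""
    ips_by_site = defaultdict(set)
    for client_ip, site in log_records:
        # A set keeps each IP once, so repeat visits are not double-counted.
        ips_by_site[site].add(client_ip)
    return {site: len(ips) for site, ips in ips_by_site.items()}


# Hypothetical log records: (client IP, site visited).
records = [
    ("203.0.113.5", "example.com"),
    ("203.0.113.5", "example.com"),   # repeat visit from the same IP
    ("198.51.100.7", "example.com"),
    ("203.0.113.5", "example.org"),
]
counts = unique_visitors(records)
```

Note that nothing in the count identifies a person. It only shows how many different addresses connected to each site.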
HTTP Headers and Server Logs
Most browsing activity is technically “recorded” in the sense that web servers routinely keep logs of the requests they receive. Whenever you visit a website, a “note” is made of your IP address, the page requested, and the date and time of the connection. Depending on what the server is configured to log, this “note” may also include browser and operating system information, which helps the sites you visit display their web pages properly on your browser. That information comes from the “user agent string,” one of the HTTP headers sent with every request your browser makes. (Details such as screen resolution and color depth are not part of the HTTP headers, but a page can read them using scripts once it loads.)
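To make the “note” concrete, here is a minimal Python sketch that pulls the IP address, timestamp and user agent string out of one log line. It assumes the server writes the widely used Combined Log Format; other servers may log different fields. The sample line, the `LOG_PATTERN` regular expression and the `parse_log_line` helper are all made up for illustration.

```python
import re

# Assumed layout: Combined Log Format
# ip ident user [time] "request" status size "referer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return the logged fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Hypothetical log entry for a single page request.
sample = (
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"https://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"'
)
entry = parse_log_line(sample)
```

Here `entry["ip"]`, `entry["time"]` and `entry["user_agent"]` hold the three pieces of information discussed above: who connected, when, and with what browser and operating system.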
Who can see this info? Your IP address is visible to any website that you visit and to the networks along the way; if the connection is not encrypted with HTTPS, those intermediaries can see the headers as well. It’s the 21st-century version of Caller ID: anybody you “call” can see the number that you’re “calling from,” and that number may provide general identification and location information. Your phone number may be recorded by anyone that you call, but in many jurisdictions they have to let you know if they’re recording the content of the call (e.g., “for quality purposes”). Likewise, reputable websites and online services that collect information beyond the basic HTTP header strings are generally expected to do so only with your consent.
Consent and Confidentiality
When you use an application or online service that collects information, you are bound by the Terms of Service or EULA (End-User License Agreement). EULAs involve explicit permission, such as clicking the “I agree” button at the bottom of a pop-up or installation dialog. When you install an app on your phone or tablet, you’ll be presented with the specific kinds of information that you’re allowing the app to access. This can go far beyond the simple HTTP header info, so it’s always in your best interest to read the EULA before agreeing.
Google is able to provide Search for free by selling advertising that is targeted using the data it collects. This is the model for virtually every “free” online service: you agree to “pay” with specific kinds of data. In the case of Google Search, the basic data that you provide is your IP address and the search queries that are sent from that address. Google’s Terms of Service allow the company to use that information on behalf of advertisers, who can then deliver increasingly relevant ads via Google products accessed from your IP address in the future. For more advanced services, you’re often required to “pay” with more data.
A cookie is a small piece of data that a website asks your browser to save during a visit. When you visit that website again, the browser sends the cookie back, giving the site information in addition to the basic HTTP header data. This data can include nearly anything, up to and including personal information such as name and address, although well-designed sites store only an opaque identifier in the cookie and keep sensitive details such as passwords and credit card information on the server. For that reason, cookies that carry sensitive data should be sent only over encrypted connections, and your browser will give you several ways to manage, deny and delete them.
Cookies were designed as a way to provide the Internet user with a consistent and convenient experience. Without cookies, any website that required a login would ask for your username and password every time you navigated to the site, or even to different pages on the same site. Cookies also enable things like persistent shopping carts, as well as the “history” of selections on the website, including preferences, filters and past site searches.
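The round trip that keeps you logged in can be sketched with Python’s standard `http.cookies` module. The cookie names (`session_id`, `theme`) and their values are hypothetical; a real site chooses its own names and would normally generate a long random session identifier.

```python
from http.cookies import SimpleCookie

# Server side: build the value of a Set-Cookie header for a made-up session ID.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"
outgoing["session_id"]["path"] = "/"
outgoing["session_id"]["httponly"] = True  # hide the cookie from page scripts
set_cookie_header = outgoing["session_id"].OutputString()

# Later request: the browser sends its saved cookies back in a Cookie header,
# and the site reads them instead of asking you to log in again.
incoming = SimpleCookie()
incoming.load("session_id=abc123; theme=dark")
session_id = incoming["session_id"].value
theme = incoming["theme"].value
```

The site would then look up `session_id` in its own records to recognize your login, while `theme` stands in for the kind of preference a site might remember between visits.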
Browsers provide you with the means to view saved cookies, as well as to delete any or all of them. Every modern web browser also provides some method of “private browsing,” which prevents the browser from recording that session’s activity in its history log and from keeping cookies once the session ends. Websites can still register the incoming IP and header information, but ongoing cookies are disabled (existing cookies won’t “know” that you’re browsing, and any new cookies won’t be saved once the private window is closed). Advanced options include “blacklisting” specific sites, so their cookies will never be saved by your browser, or even “whitelisting” to only allow cookies from sites of your choosing.
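The “blacklisting” and “whitelisting” options boil down to a simple rule that can be sketched as follows. This is not how any particular browser implements it; the `cookie_allowed` function and the domain names are hypothetical, and real browsers apply more detailed matching.

```python
def cookie_allowed(domain, blocklist=frozenset(), allowlist=None):
    """Decide whether a cookie from `domain` should be saved.

    Domains in `blocklist` are always rejected. If `allowlist` is given,
    only domains that appear in it are accepted.
    """
    domain = domain.lower()
    if domain in blocklist:
        return False
    if allowlist is not None:
        return domain in allowlist
    return True


# Hypothetical settings: block one ad server, otherwise accept cookies.
blocked = {"ads.example.net"}
ad_cookie_ok = cookie_allowed("ads.example.net", blocklist=blocked)
shop_cookie_ok = cookie_allowed("shop.example.com", blocklist=blocked)

# Stricter setting: accept cookies only from a site you have chosen.
trusted = {"bank.example.com"}
bank_cookie_ok = cookie_allowed("bank.example.com", allowlist=trusted)
other_cookie_ok = cookie_allowed("shop.example.com", allowlist=trusted)
```

Python’s standard `http.cookiejar.DefaultCookiePolicy` accepts `blocked_domains` and `allowed_domains` arguments that serve a similar purpose for scripts that make their own web requests.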
Profiles and Accounts
Basic data can be combined with more personal data, but only with your consent. For example, if you are signed in to a Google Account while you search, your profile data is linked to search history, and cross-referenced between Chrome, Search, YouTube, Android apps, Gmail and other Google products. This can include a lot of data that you may prefer to keep private, so Google also provides you with various ways to keep track of your information — including the Google Dashboard, an absolute must-see site for any Search user with a Google Account.
Other online services store data and specify what kinds of data are available to others. Social networking sites often have various levels of sharing, in which certain personal data is hidden from the public but accessible by friends. Likewise, the site’s Terms of Service should inform you of what data is stored, including searches performed on the site or hyperlinks shared to the site, and to whom the data can be made available. This can become complicated to manage, depending upon the site’s user controls and the transparency of its agreements with third parties such as app developers and advertisers.
Even with only an IP address to work with, you may be surprised at just how accurately these bits of anonymous and incomplete data can anticipate your likes and preferences. The more personally identifiable information you agree to add to the mix, the more efficient and convenient the Internet becomes every time you search. Along with this tailored experience comes increasing responsibility to be aware of the tools at your disposal to manage and protect your personal data.