Follow the Money, A Handbook
You've probably discovered already that if you're interested in following the money in politics, OpenSecrets has done most of the heavy lifting for you. But if you want to dig around in Federal Election Commission data on your own - or research state- or local-level data - then you've found your guide.
Do-It-Yourself Data-Digging
Published by the Center for Responsive Politics in 1994, this handbook was written especially for journalists by our former executive director, Larry Makinson, who developed the system we use to classify campaign contributions while he was a newspaper reporter. The methods outlined in the selected chapters below apply whether you're vying for a Pulitzer Prize or just muckraking in your town. And if you just want to know more about how our hard-working staff produces the campaign finance research that's on this website, the "Follow the Money" handbook will fill you in.
Getting Started
Getting to the bottom of the bottom line in political contributions can be a challenging and time-consuming task, but it's guaranteed to be a rewarding one. Even under the worst-case scenario - you're typing everything into the computer yourself, the contributors don't list their occupations or employers - you are guaranteed to learn things about the way politics works that you'd never have known without digging into the records.
While the process of hand-entering thousands of contribution records into a computer is no one's idea of a great way to pass the time, there is much to be said for the process of osmosis that inevitably occurs while you're doing it. The same names will pop up over and over again, as you begin to ferret out the top-spending lobbyists and other political rainmakers who are well-known to the politicians, yet all but unknown to anyone else. You'll see clusters of contributions from both expected and unexpected places. And, if you dig through enough records, you're almost guaranteed to find a scandal or two. Politicians are not always as careful at covering their financial tracks as one might think.
My own first experience at digging through campaign finance records was in Alaska in 1985. One of the most interesting discoveries was a cluster of $1,000 contributions to an incumbent state senator that all arrived the same day from an unlikely collection of givers. A total of $21,000 came in the door that day, from people none of the political reporters had ever heard of. They all lived in the Matanuska Valley, north of Anchorage, and they listed occupations like "farm hand," "construction worker," "potter" and, inevitably, a number of "homemakers" and "housewives."
Two of the contributions came from a husband-wife team. The woman owned a dairy, the man a construction company. Armed with the list of contributors and their addresses, I drove up to the valley to knock on doors. It didn't take long to find out the story. The contributions had all come from employees of this enterprising couple, and from the employees' spouses. As one contributor admitted to me - and later to a state inquiry - each employee was given a $1,500 "tool allowance" that was to be deposited in their personal checking accounts. They were then to write a $1,000 check to the senator, also from their personal checking accounts. The checks were then collected and hand delivered.
The dairy, it seemed, was nearing the deadline for complying with a state homesteading law that required a certain amount of work to be done to qualify for the homestead. They needed an extension or a waiver. The senator was in a position to recommend it. And so the $21,000 was rounded up and delivered.
The scam was uncovered simply by looking at campaign finance records, noticing a suspicious pattern - the similar amounts, the dates of the checks, the suspiciously blue-collar occupations of such big donors - and following up with old-fashioned legwork. The same kinds of stories, with local variations, are almost certainly buried in dusty filing cabinets from Olympia to Tallahassee and everywhere in between. All they await is someone digging into the records, systematically transferring the data from filing cabinets into a computer, then sifting through it to find what's there.
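Once the records are in a computer, the pattern described above (several identical amounts arriving for one candidate on the same day) can be scanned for mechanically. What follows is a minimal Python sketch rather than the method used in the original investigation. The record layout, the sample rows, the `find_clusters` helper and its `min_size` cutoff are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (contributor, candidate, date, amount).
# Names, dates and amounts are invented for illustration only.
records = [
    ("Adams, Ann", "Senator X", "1984-10-01", 1000),
    ("Baker, Bob", "Senator X", "1984-10-01", 1000),
    ("Clark, Cy", "Senator X", "1984-10-01", 1000),
    ("Davis, Dee", "Senator X", "1984-10-15", 250),
    ("Evans, Ed", "Senator Y", "1984-10-01", 1000),
]

def find_clusters(rows, min_size=3):
    """Group gifts by (candidate, date, amount); keep groups with at least min_size donors."""
    groups = defaultdict(list)
    for contributor, candidate, date, amount in rows:
        groups[(candidate, date, amount)].append(contributor)
    return {key: names for key, names in groups.items() if len(names) >= min_size}

clusters = find_clusters(records)
for (candidate, date, amount), names in clusters.items():
    print(f"{candidate}: {len(names)} gifts of ${amount} on {date}: {names}")
```

A cluster flagged this way is only a lead. As in the story above, it takes legwork to find out whether there is anything behind it.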
A number of other surprises were also uncovered in that initial investigation. A $20,000 bundle from a collection of apparently unconnected Seattle residents eventually was traced to an Alaska lobbyist acting on behalf of a Canadian corporation. (That one was caught because the givers used sequentially numbered cashier's checks to make their contributions. Alaska requires the check number to be recorded on each contribution.) Then there were the recurring contributions from two Anchorage businessmen whose donations came not under their own names, but from 21 separate corporations they jointly owned. The contributions were typically for $250 each, but they were bundled - usually in groups of $1,000 or $2,000 - and delivered to favored candidates. When all the donations were added up, these two business partners turned out to be among the biggest contributors in Alaska. No one ever knew until we started digging through the filing cabinets, putting the contribution data on computer and sorting through it to see what we would find.
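Where check numbers are recorded, runs of consecutive numbers like the cashier's checks mentioned above can be picked out with a few lines of code. This is a rough Python sketch under stated assumptions: the `sequential_runs` helper, its `min_len` cutoff and the sample check numbers are hypothetical, and real data would also need to be grouped by bank or by candidate first.

```python
def sequential_runs(check_numbers, min_len=3):
    """Return runs of consecutive check numbers that are at least min_len long."""
    runs, current = [], []
    for number in sorted(set(check_numbers)):
        if current and number == current[-1] + 1:
            current.append(number)          # extends the current run
        else:
            if len(current) >= min_len:
                runs.append(current)        # close out a long-enough run
            current = [number]              # start a new run
    if len(current) >= min_len:
        runs.append(current)
    return runs

# Invented check numbers, for illustration only.
checks = [5012, 1107, 5010, 5011, 2290, 5013, 731]
print(sequential_runs(checks))
```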
The same rewards await anyone who takes the time and effort to dig through the public records. From my own experience, and from that of other researchers who've dug through records on their own, I now recognize the symptoms of what happens when you begin to do this work. You work longer hours at the computer than you ever thought you would. Your eyes may be bloodshot, your fingers fighting off cramps, but there are few rewards so sweet as that adrenaline-pumping "bingo!" or "gotcha!" when a pattern jumps out at you, or the missing link in a seemingly unrelated collection of contributions suddenly falls into place. There's a lot to be said for the little discoveries you uncover simply through osmosis while sorting through the records. It's a nice reward for all the work.
Still, if you're going to try to dig out the facts behind who's underwriting the campaigns of your state legislature or city council or governor, there are many techniques, shortcuts and tools of the trade that will make the job easier and more fruitful. This section of the Handbook gives you the tools you'll need to get the job done.
THE SEVEN STEPS
Before we get into the nuts and bolts of doing the job, it's worth an overview of what the job will consist of. Basically, there are seven major steps in identifying the money that pays for elections:
Set the scope of your project and gather the records. Are you looking to do the whole legislature? The governor's race? Your congressional delegation? Step one is figuring out how big your project will be, then gathering the records. To help get you started, the Handbook reviews on pages 59-69 what data is available in each of the 50 states and where to get it.
Set up a database. Next you have to get your computer ready to receive the data. Thankfully, nothing fancy is needed. Any off-the-shelf database program will do, and you can be up and running and ready to enter your records in less than an hour. There's no limit to the sophistication you can build into your database later if you want to, but for now all you need is a simple structure for holding the data you've collected.
Enter and standardize the data. If you're dealing with paper records, entering them into the computer will be your first big job. There are plenty of shortcuts that can help speed data entry, as well as things you should watch out for as you're progressing through the stacks of paper records. If you're importing records from a disk, or via modem, the data entry is already done and all you need to do is load it into your database. Either way, once the data is in, you'll need to clean up inconsistencies and standardize the names of contributors and their employers.
"Fingerprint" the contributors. This is the process of expanding the information you have on each contributor - assigning an ID number, filling in their occupation/employer or ideological interest, and identifying spouses and children who may also be contributing. The most time-consuming step here is identifying the contributors' occupations and/or employers. About half the states, and the federal government, require this information to be listed on contribution reports. If your state does, you're a giant step ahead in the fingerprinting process.
Categorize the contributions by industry and interest group. This is the step where you begin to hit paydirt. Instead of simply working with a list of names, companies and PACs, you'll now be classifying each contribution into its own industry or interest group category. Here's where you'll begin to see the real patterns - how much lawmakers are getting from dairy farmers, how much from lawyers and lobbyists, how much from the cable TV industry, securities brokers, public employee unions, or insurance executives. To do this, you'll need to do some digging in reference libraries, or plug in to one of the growing numbers of CD-ROMs that list companies by business type.
Look for patterns. Play with the data and see what you've got. This is the real fun. Now comes the time to sort through all the records, exploring what you've collected. Calculate the top political contributors in your state. Find out which industries have been targeting specific committees with generous helpings of campaign funds. Compile contributor profiles for key politicians, identifying their cash constituents whose identities were unknown before your research began.
Graph the most interesting patterns, and write your stories. After all the research, this is the time to put some flesh on the bones of your findings - doing the interviews, presenting the data in charts and graphs that will paint the picture so anyone can follow what is going on. Ideas on how to do this will come in Part III of the Handbook, "Reporting the Story." As for now, it's time for the nuts and bolts of how to get rolling.
Setting the Scope of Your Research
The first question facing you as you explore the possibilities of investigating money in politics is how big a chunk of contribution records you should try to examine. This question obviously depends on the resources you've got available - computer equipment, staffing, and time. It also depends on what state you're looking at. Categorizing all the contributions for the North Dakota legislature is one thing; doing it for California is quite another. No matter which state you examine, time and staffing are the crucial variables, since you can put together a database on virtually any computer you've got at your disposal. All you need is enough storage space on your hard drive, and just about any off-the-shelf database program. Even a spreadsheet will do for the data-entry part of the job if that's all you've got.
The ideal scope for a project would be a database that covers the entire state legislature, plus the governor and other top statewide elected officials. On a more local level, the entire city council is an obvious target, as are the candidates for mayor and, possibly, county commissioners. One way to cut down your workload is to restrict the research to only those candidates who were actually elected. You'll miss a lot of the money, clearly, but you'll have the most important data that you need when reporting on legislative issues. Another way to cut it - again, a compromise, but one to consider - is limiting your database to only those contributions over a certain amount, say $100 or $250 and above.
If the whole state legislature is too big a chunk to start with, a good way to pare down the scope of your project is to restrict it to one part of the legislature. You could do just the state Senate or the House, you could concentrate on the house and senate leadership, or you could focus on one or a few key committees. If you can't take on the whole legislature, the next best thing is to do it by committee. Though voters rarely give it a thought, the real nuts and bolts work of the U.S. Congress, and most state legislatures, takes place at the committee level. In the Congress, and in the states, certain of those committees are important centers of power - and focal points for intensive lobbying and energetic contributions.
The House Ways and Means Committee, for example, which crafts the nation's tax laws, is crucial to virtually every business (and individual) in America. The Armed Services Committees and the Defense Appropriations Subcommittees can spell fiscal life or death for defense contractors. The House Energy and Commerce Committee - little known outside Washington - sets national policy for health care, telecommunications, the oil and gas industry, the securities and financial industries, electric utilities, railroads and a wide swath of other important industries. A seat on that committee, which virtually guarantees a generous supply of PAC contributions year after year, is one of the most sought-after assignments on Capitol Hill. Similar powerhouse committees exist in every state capitol. Focus your investigation on those committees and you will find the biggest centers of campaign funds in your state. You will likely also find the most direct correlations between campaign contributions and legislative actions. So if you can't do the whole legislature right away, start with a few top committees and expand your research later.
One important point on committees: if your state senators have longer terms than state house or assembly members (as is the case in almost every state), concentrate on the lower house first. When lawmakers have to run every two years, their complete fundraising cycle will coincide with the normal two-year election cycle. In the U.S. Congress, where senators run only once every six years, you've got to look at a full six years of history (or three two-year election cycles) to get an accurate view of all the money going to a particular senate committee. Senators typically raise most of their money in the two years leading up to their reelection race, so if you review fundraising for the Senate Armed Services Committee in the 1992 election cycle, for example, you'll find that the biggest recipients of campaign cash were those with races in 1992. To get the full picture of who's been getting what, you need to look back six full years. Keep that important point in mind as you begin your project. If you're doing your project committee-by-committee, it makes a lot more sense to start with the lower chamber and work up - particularly if you're short on time or resources.
STRATEGIC ALLIANCES
This is a good point to bring up a subject that ought to be considered as you're beginning to plan your project: is it important that you do it alone, or might it be possible to enter into a strategic alliance with another organization to help with the work? In some places, the competition between news organizations is so intense that the thought of a joint venture would be enough to scuttle the project outright. Two competing newspapers in the same city might be loath to cooperate on anything, let alone on a database that could provide a rich lode of stories for months and years to come. But many other partnership possibilities may exist. Two Florida newspapers - the Miami Herald and the St. Petersburg Times - cooperated for a number of years in compiling a database of contributions to the Florida legislature. Other papers joined in at various times. Their readership areas, by and large, didn't overlap, and each paper successfully mined the database for unique stories.
Since ongoing analysis of the state legislature would probably be the highest priority in just about every state, similar cooperatives could be worked out almost anywhere, if the news organizations are willing to share resources.
Another possibility is teaming up with an academic or other non-profit research organization. A partnership with a university research group could provide a great deal of assistance to a news organization, as well as supplying rich material for class projects and case studies for the academics. You may well want to do a project on your own, but if you could use a hand with resources, and don't mind sharing at least a portion of your findings, don't rule out the idea of forming a partnership with another organization.
SETTING UP A CONGRESSIONAL DATABASE
There's one other option you might want to consider, if the thought of entering thousands of records is too daunting - set up a database of contributors to congressional candidates. A logical slice here would be to include all the members of your state delegation, plus any current candidates. Frankly, this is something every news organization ought to have, both in its newsroom and its Washington bureau. It's also an excellent (and mostly painless) way to get started with a contributor database, since the data is already available electronically from the Federal Election Commission. Even better, you can get a database of contributions already coded into industry and interest group categories from OpenSecrets.
OpenSecrets' coding process is not a quick one - we tend to be several months behind in PAC contributions and much longer behind in coding individuals - but at least all the work (or most of it, anyway) is already done. All you have to do is set up the database structure and import the records. The same is true of the FEC data (you just need to import the records), but the FEC doesn't standardize employers or apply category codes to PACs and individual contributions. Nevertheless, there's no data entry involved with a congressional database (unless you're trying to keep current with the latest election year filings), so the biggest single labor-intensive part of setting up the database is already done.
Collecting the Data
The starting point for your research will be the records available at the state, local, or federal agency that tracks the campaign contributions you're interested in. Since different jurisdictions have different disclosure laws - and since some offices are more computerized than others - there's a wide variance in what your actual starting point will be.
At worst - and this is the case in most states - you'll be dealing with paper records that have never been computerized. More and more states are beginning to put some of their contribution data on computer, and the federal government has done it for years. But most are still not there (and many will never be there due to budget constraints and a lack of interest) - all of which means your first job will be collecting copies of the paper records at the local office that handles them, and keying them into your computer by hand.
All the states have established prices for copying records, but keep in mind that these are designed for the typical customer who walks in the door - not for a news organization that wants not just a few reports, but whole filing cabinets full of records. Don't hesitate to try to negotiate a lower copy rate per page, or to bring your own paper - or even a portable copier - to their office. You might also try a little bartering - giving the office access to some of your data when you're done with it, or something else that's useful to them.
State elections offices are almost always in a delicate political position. They are invariably under-funded and overworked, their job is to regulate the very politicians who control their budget, and there are always pressures from lawmakers not to be too eager in their work. Nevertheless, nearly every office has one (or many) staffers who will be willing to bend over backwards to help you do the job that they themselves don't have the power or resources to do. Tap into those people, and every phase of your job will be easier.
WHAT DATA DO YOU WANT TO COLLECT?
Your first question before beginning to collect the data for your research is figuring out exactly what data you need. There are two considerations here - what's available in your state and how big a chunk you can bite off without being overwhelmed.
In every state, candidates for public office must file periodic reports of the money coming into their campaign and going out of it in expenditures. The records you're interested in (at least for the scope of research outlined in this book) are those showing the money coming in - the contributions received.
When candidates file their reports, they list two specific kinds of information. On the summary page of their report, they'll list the totals - how much money has come in to their campaign in the last reporting period, and how much they've spent. Typically, they also include additional information, such as the candidate's current cash on hand, and the running totals of their contributions and expenditures over the past year. These summary pages are valuable in their own right, particularly when you're writing election season stories under tight deadlines. They offer a quick comparison showing the amounts raised by different candidates, and they're often used to informally handicap who the 'serious' candidates are. But for researching the source of the campaign funds, you can skip the summaries for now and go straight to the detail pages.
In every state, candidates must itemize all contributions over a certain amount. The threshold varies from state to state. At the federal level, all contributions of $200 or more must be itemized. In a few states, all contributions must be itemized. But in most, contributions smaller than the threshold amount can be reported simply as a lump sum.
The part of the reports that you're interested in is the itemized contributions. Each of these entries will typically include the name, address, and sometimes the occupation of the contributor, as well as the amount given and the date of the contribution. That information is the core of what you'll be putting into your database.
Setting up a Database and Entering Data
Computer-assisted reporting is a hot-button topic around newsrooms these days. Some news organizations have made computer-assisted reporting a major priority, and are using mainframes, nine-track tapes and powerful PCs to investigate areas that previously were untouchable. Many others have yet to step into the computer age beyond using terminals for word processing. Most are probably somewhere in between.
The good news about campaign finance databases is that they are among the easiest of all database projects to set up. In fact, putting together a do-it-yourself contributor database is so simple technically that it's the ideal project to get a reporter - or a whole news organization - up and comfortable with computer databases.
Anyone, with even the smallest of computers and the most rudimentary of database programs, can put together a database. No fancy equipment is needed, and the work can be done (if there's no support from editors) in odd hours of the day, or nights and weekends.
Organizations wanting to jump in in a big way can build as sophisticated a database as you could dream of with an industrial-strength program such as FoxPro, Paradox, or the newer generation of programs such as Microsoft Access. If you know your way around databases, pick whichever program you're most comfortable with. If you don't know a database from a spreadsheet, find the simplest, most intuitive program you can, and set it up in that. You shouldn't have to pay more than about $100 for a simple "flat-file" database. For more sophisticated systems, a "relational" database is ideal. But it's not necessary, particularly when you're getting started. Better to learn as you go using a simpler, more intuitive program, then move on to something bigger if and when you need to. (Once the data is in the computer, it's a relatively simple matter to transfer it from one program to another.)
If you're going to be hand-entering printed data into your database (which you'll likely have to do if you're looking at state or municipal records), pick a program that offers shortcuts for data entry. If you're using a Macintosh, Panorama is an excellent choice.
A FEW WORDS FOR NON-TECHNICAL READERS...
If you're a computer neophyte or computerphobe, this may be the point where you're beginning to work yourself into a nervous sweat. Don't worry. Amazing advances have been made in recent years in making computers much more friendly than they've ever been before. As a longtime Macintosh user, I long ago got used to the notion that computers ought to make life easier, not more complicated, and that software ought to be intuitive enough that you hardly need to open the manual to figure it out. Thankfully, this trend toward simplicity, and away from the mind-boggling complexities of years past, has swept beyond Macs into the PC-compatible world as well. Microsoft Windows has been the primary carrier, and though it's not yet as simple and elegant as the Mac, it's getting closer all the time.
The current crop of database programs makes data entry, and the development of simple yet powerful databases, infinitely easier than it was a few years ago. Entering campaign contribution data into a database is about the easiest thing you can do to get your feet wet in this brave new world. It will allow you to ease in slowly. Once you dip your toe in, you'll find the water's fine.
STRUCTURE OF THE DATABASE
Database programs allow you to take huge amounts of data and store them in your computer piece by piece, so you can rearrange them easily, sort through them, calculate totals, and basically manipulate them in almost any way imaginable. To do it, databases break up the data into individual records and "fields." A record is a single transaction - a contribution to a state senate candidate, for example. A field is an element within that record, such as the contributor's name, the amount of the contribution, the date, etc.
To set up your first campaign finance database, you should begin with the paper records and set up the computer to mimic those forms. The records you're primarily interested in are the itemized contributions to candidates. These records will typically include the contributor's name and address, the candidate's name, and the amount and date of the contribution. Each of these elements should be fields in your database.
About half the states, and the federal government, also require contributions over a threshold amount to include the contributor's occupation and employer. Of all the bits of data, this is probably the most important, since it's the one you'll use later to assign an industry or interest group code to the contribution. If your state requires this information, be sure to include extra fields for it in your database.
A handful of states require additional information. If yours does, you'll want to add that as another field, too. Alaska, for example, requires candidates to write down the check number of each contribution over $100 - a useful idea that makes it possible to identify connections between contributors that are not otherwise visible. Connecticut requires contributors to reveal whether they are lobbyists, or members of a lobbyist's immediate family. Kentucky requires statewide candidates to disclose the name and employer of the contributor's spouse (an excellent way of identifying the economic interests behind what otherwise would be a contribution from a "housewife" or "homemaker"). Obviously, these extra bits of information are valuable - if the forms you're looking at include them, be sure to include them as extra fields in your database.
Let's assume you have all the standard elements on the paper records you're working with. Here's a workable structure you can use to get started.
| Data | Field name | Length | Field type |
| --- | --- | --- | --- |
| Contributor's name | Contname | 40 | Character |
| Contributor type | Conttype | 1 | Character |
| Candidate's name | Candname | 20 | Character |
| Contributor's address | Address | 40 | Character |
| Contributor's city | City | 18 | Character |
| Contributor's state | State | 2 | Character |
| Contributor's zip | Zip | 5 | Character |
| Contributor's occupation | Occupation | 30 | Character |
| Contributor's employer | Employer | 40 | Character |
| Contribution date | Date | 8 | Date |
| Amount | Amount | 5 | Numeric |
Later you'll be adding extra fields - ID numbers for contributors and candidates, a "newemploy" field to hold the contributors' standardized employer/occupation, and a code that lets you classify the contribution by a specific industry or interest group. Don't worry about those fields now. First you need to get the records into your computer, and the simple setup outlined above is all you need. Once you've got your database structure, you're ready to start entering data.
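For readers working with current tools, the structure above can be set up in a few lines. This is a minimal sketch using Python's built-in `sqlite3` module, which is an assumption of this example and not one of the programs the handbook discusses. The table name `contributions`, the ISO text format for dates and the sample record are likewise invented for illustration. SQLite does not enforce the field lengths listed above, so they are left out.

```python
import sqlite3

# In-memory database for the sketch; pass a file name instead to keep the data.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE contributions (
        contname   TEXT,     -- contributor's name, "Last, First" format
        conttype   TEXT,     -- one-letter contributor type code
        candname   TEXT,     -- candidate's name
        address    TEXT,
        city       TEXT,
        state      TEXT,
        zip        TEXT,     -- kept as text so leading zeros survive
        occupation TEXT,
        employer   TEXT,
        date       TEXT,     -- stored as YYYY-MM-DD text (an assumption)
        amount     INTEGER   -- whole dollars
    )
    """
)

# One invented record, for illustration only.
conn.execute(
    "INSERT INTO contributions VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    ("Jones, Henry B Jr", "I", "Smith, Jane", "12 Main St", "Juneau",
     "AK", "99801", "Attorney", "Jones & Jones", "1992-10-01", 250),
)

total = conn.execute("SELECT SUM(amount) FROM contributions").fetchone()[0]
print(total)
```

Extra fields such as ID numbers or category codes can be added later with `ALTER TABLE ... ADD COLUMN`, which matches the advice to start simple.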
A few comments on some of the fields are in order here:
Contributor's name. The traditional way to store names in computer databases is to break the name up into at least two, and possibly several fields: first name, last name, middle initial, prefix, suffix, etc. Are all these fields really necessary? Based on my own experience at working with these databases, I'd give a qualified no. It might be useful to have a first name-last name division, but even that's not really necessary - and there's at least one compelling reason why it's better to keep it all as a single field. Many of the contributors you'll be entering are not individuals, but organizations - whether PACs, unions, or corporations. Fitting their full name into the "lastname" field is going to be difficult, unless you make the lastname field 40 characters long. (And if you do that, you'll be using up lots of unnecessary disk space.) If you find you later do need two fields, you can always create them by having the computer split them apart. It's also more convenient to sort on a single field than on two fields.
If you do enter contributor names as a single field, do it in the following format: "Jones, Henry B Jr" (or Dr, or MD, etc). You'll be sorting the names later alphabetically, so make sure the last name comes first, followed by a comma, followed by the first name and any other initials or professional abbreviations.
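Having the computer split a single name field apart later, as suggested above, might look like the following Python sketch. The `split_name` helper is hypothetical. It assumes individuals are entered as "Last, First" and that a name without a comma belongs to an organization, so an organization whose own name contains a comma would be split wrongly and would need checking by hand.

```python
def split_name(contname):
    """Split "Last, First ..." into (last, rest); names with no comma are returned whole."""
    last, sep, rest = contname.partition(",")
    if not sep:                      # no comma: treat as an organization name
        return contname.strip(), ""
    return last.strip(), rest.strip()

# Invented names, for illustration only.
names = ["Jones, Henry B Jr", "Acme Dairy PAC", "Adams, Ann"]
for name in sorted(names):           # a single field sorts alphabetically as-is
    print(split_name(name))
```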
TIPS WHEN ENTERING NAMES
When you're entering names, don't forget extra elements like "Jr", "Sr", "Dr", "Mrs," etc. Also be sure to include any extra initials at the end, like "MD", "DDS", "CPA," etc. that will help you identify their occupation.
Do not copy "Mr." or "Ms." into your database, and if you're starting out with records that are already computerized, strip away the "Mr." from the files. This will help you later when you're trying to standardize names. On the other hand, do copy "Mrs." - particularly if the name is a man's, as in "Mrs. Henry Jones."
Eliminate periods after abbreviations like "Mrs.," "Dr.," etc. and also after middle initials. It's just an extra keystroke and it doesn't tell you anything you don't already know.
Be consistent when entering names of people with two first initials. Probably the easiest is to leave a space, but no periods, between the two initials, as in "H R Haldeman." Once you start doing it this way, don't switch to "HR Haldeman," or the records won't line up alphabetically when you start sorting.
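If you're starting from records that are already computerized, several of these tips can be applied automatically. The following Python sketch is one possible approach, not a complete cleaner. The `clean_name` helper is hypothetical and matching is case-sensitive. It does not separate run-together initials such as "HR", which would still need a manual pass.

```python
import re

def clean_name(raw):
    """Drop periods, strip "Mr" and "Ms" (but keep "Mrs"), and collapse extra spaces."""
    name = raw.replace(".", " ")                # no periods after abbreviations or initials
    name = re.sub(r"\b(Mr|Ms)\b", " ", name)    # \b keeps "Mrs" from matching "Mr"
    return " ".join(name.split())               # tidy up any doubled spaces

print(clean_name("Mr. H.R. Haldeman"))
print(clean_name("Mrs. Henry Jones"))
print(clean_name("Jones, Henry B. Jr."))
```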
Contributor type. Later, it will be useful to separate individuals from other types of contributors. Enter a one-letter code here to tell yourself what kind of contributor this is. You don't need to get too specific. The following codes will do:
P = PAC. Political action committee.
I = Individual.
C = Corporation or other business organization.
L = Labor union.
R = Republican Party, and its local affiliates.
D = Democratic Party, and its local affiliates.
3 = Other political parties.
Contributor's address. This is the street address of the contributor. It's likely to be one of the most complicated and time-consuming fields to enter, but it will be very useful later when you're trying to link spouses and children with the income-earner in the family.
TIPS WHEN ENTERING ADDRESSES
You don't really need to enter the address every single time, particularly if the contributor is a PAC, since you won't need the PAC's address to identify it later. If the contributor is a corporation, however, its address may well be useful, as executives from the company sometimes list their office address on personal contributions. (This also helps you confirm their place of employment in case they don't list it.)
Eliminate periods. Abbreviate wherever you can, and be consistent. Use "PO Box" instead of "Post Office Box" or "P. O. Box." Every keystroke saved is a keystroke closer to finishing the job.
Contributor's city, state and zip. These are three separate fields. They'll be useful for a variety of things later - like determining in-state vs. out-of-state contribution totals, for example, or compiling a list of the golden zip codes with the deepest political pockets. The city and state fields in particular are ones that will be repeated over and over again, so look for a database program that will allow you to "repeat" the entry from the previous record automatically. (In other words, if you've got 25 contributions in a row from "Los Angeles," let the computer fill it in when you tab to the city field.) Other programs (like Panorama on the Mac) have a feature they call "clairvoyance." You type the first two or three letters of the word and it fills in the rest, based on what you filled in earlier in that field. Another thing you can do is skip the field as you're entering the records, then fill in a block of them later, through cutting and pasting or a simple replicate command. Yet another option is using temporary abbreviations - LA for Los Angeles, for example, or Chi for Chicago. When you're all finished, it's easy to have the computer expand these abbreviations to the full word.
Contributor's occupation/employer. The federal government requires that this information be listed on all contributions of $200 or more. Many states also require it, though the dollar threshold for disclosing it varies. Of all the fields in your database, this one is probably the most important. It will be the basis of your calculations on who the biggest contributors are, and which industries give most heavily. This is also the field you'll be concentrating on when the time comes to assign category codes to each contribution.
TIPS WHEN ENTERING EMPLOYER/OCCUPATION NAMES
Abbreviate whenever possible, and be consistent. Use "Inc" and "Corp" and "Co" and don't use periods.
Replace "and" with "&" as in "Jones & Day" or "Ferrari & Sons Construction."
Law firms pose a special challenge, as they usually consist of a string of names, as in "Akin, Gump, Strauss, Hauer & Feld." The rule of thumb we use at OpenSecrets, and one we recommend, is including the full name of the firm only if there are three or fewer names in it. For anything longer, use the first two names and "et al" - as in "Akin, Gump et al." It's shorter that way, and it's also more consistent, as law firms have a way of changing their names as partners come and go. (The first couple of names in the law firm usually stay the same, but the latter names often vary through the years.)
Be consistent in how you treat names of companies that begin with initials. In general, it's best not to use spaces between the initials. Use "EF Hutton," for example, not "E F Hutton," and "AT&T," not "A T & T." But whatever you do, don't mix and match the styles, or your records won't match up when they're sorted later. And again, save keystrokes and don't use periods.
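Several of the employer tips above are mechanical enough to hand to the computer. Here is a minimal sketch in Python, with hypothetical function names `clean_employer` and `shorten_law_firm`. It assumes you call `shorten_law_firm` only on names you already know are law firms, since the sketch has no way of telling a law firm from any other company with commas in its name.

```python
import re

def clean_employer(raw: str) -> str:
    """Drop periods, turn 'and' into '&', and collapse extra spaces."""
    text = raw.replace(".", "")
    text = re.sub(r"\s+and\s+", " & ", text, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", text).strip()

def shorten_law_firm(name: str) -> str:
    """Keep firms of three or fewer names whole; otherwise first two plus 'et al'."""
    partners = [p.strip() for p in re.split(r",|&", name) if p.strip()]
    if len(partners) <= 3:
        return name
    return f"{partners[0]}, {partners[1]} et al"
```

Running `shorten_law_firm("Akin, Gump, Strauss, Hauer & Feld")` gives "Akin, Gump et al", while "Jones & Day" comes back unchanged.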
Date of the contribution. Most database programs allow you to easily format a date field so you need to type only a few characters of the date, not the whole thing. Since most of the contributions will at least be from the same year, you can use these formatting features, type something like 0512 and have the computer fill out the date automatically as "5/12/94." Again, the important thing is to eliminate keystrokes wherever possible.
Amount of the contribution. This is a numeric field, formatted in dollars. Don't bother with cents at all - just enter $500 for a contribution of that amount, not $500.00. If you come across any contributions for odd amounts, like $259.95, round it off to $260.
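If your database program lacks these formatting shortcuts, the same keystroke savings can be had with a few lines of code. This is a sketch only, in Python: the function names are invented for illustration, the four-digit MMDD shorthand is an assumption about how you choose to type dates, and the half-dollar-rounds-up rule is my reading of "round it off."

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP

def expand_date(shorthand: str, year: int) -> date:
    """Turn a four-digit MMDD entry such as '0512' into a full date."""
    if len(shorthand) != 4 or not shorthand.isdigit():
        raise ValueError(f"expected four digits (MMDD), got {shorthand!r}")
    return date(year, int(shorthand[:2]), int(shorthand[2:]))

def short_format(d: date) -> str:
    """Display a date the way the handbook writes it, e.g. 5/12/94."""
    return f"{d.month}/{d.day}/{d:%y}"

def whole_dollars(amount: str) -> int:
    """Strip '$' and commas, then round to the nearest dollar (halves go up)."""
    cleaned = amount.replace("$", "").replace(",", "")
    return int(Decimal(cleaned).quantize(Decimal("1"), rounding=ROUND_HALF_UP))
```

Decimal arithmetic is used for the amounts because Python's built-in `round` sends exact halves to the nearest even number, which would turn $250.50 into $250.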
Because many of the fields will be repeating themselves in a given series of records - the same city or state or candidate, for example - it makes a lot of sense to set up your computer screen in a row-and-column spreadsheet-type format, rather than as individual records. You could even use a spreadsheet program to enter the data, then transfer it later into a database.
One final word on entering data. The temptation, after you've entered your last record, is to get on to the next step (or to turn off the computer and go home). But your work is not quite finished. This is the time to go back and proof your work, comparing the computer records with the paper records. Your accuracy will be better (and your eyes will be healthier) if you print out your records rather than scanning them quickly on the computer screen. If the paper reports have subtotals on every page, recheck your own totals to make sure they match.
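The subtotal check is one part of proofing the computer can do for you, provided you recorded the page number of each contribution as you typed it in. That page-number field is an assumption on my part (it is not among the fields described above), and the function name below is hypothetical. A minimal Python sketch:

```python
from collections import defaultdict

def mismatched_pages(records, reported_subtotals):
    """Return the page numbers whose computed total differs from the paper subtotal.

    records: iterable of (page_number, amount) pairs from your database.
    reported_subtotals: dict mapping page_number to the subtotal printed on that page.
    """
    computed = defaultdict(int)
    for page, amount in records:
        computed[page] += amount
    pages = set(computed) | set(reported_subtotals)
    return sorted(p for p in pages if computed.get(p, 0) != reported_subtotals.get(p))
```

Any page the function returns is one to pull from the paper file and recheck line by line.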
WHO DOES THE DIRTY WORK?
Long hours of data entry is no one's idea of a good time, but it's a necessary first step in computerizing campaign finance data. Who should do it? The reporter who's organizing the project? Temp workers? Student interns? The choice will likely depend on budgetary factors - both financial and timewise.
As long as you carefully check the records once they're in, there is no reason not to let someone else help you input the data. If you're fortunate enough to be able to hire temporary employees, terrific. If you're able to round up a few volunteers from around the newsroom - other reporters or interns - that's fine too. Just be sure everyone is using the same stylistic conventions, the same abbreviations, and the same penchant for detail and accuracy.
Whatever the arrangement, the one recommendation I would have is that the reporter who is doing the main work should be one of the people inputting the data. If you can find someone to help you, great. But even if you do get help, it's important to get your hands dirty in entering data yourself. The most important reason is osmosis. You simply pick things up - trends, names that keep repeating, oddities that bear further investigation - subtle things that tell you something is going on that looks a little suspect. The other thing hands-on inputting does is give you a sense of what everybody else is doing. It's tough to supervise someone on a job you've never really done yourself. Be a participant, even if you do have the luxury of supervising a team of inputters rather than doing it all yourself.
The ideal situation for a news organization tackling the job of trying to computerize, say, the campaign finance records of an entire state legislature, would be to form a strategic alliance with another organization, such as a local university or university-sponsored research organization. This is a project that would make an ideal classroom project in political science, journalism, or both. It would help bring the real world of politics into the theoretical world of the classroom, and it would provide an education for all involved. It would also provide enough extra help for news organizations that it could make the difference in actually convincing your editors or publisher to undertake a major project.
As long as the work is supervised, as long as accuracy and consistency can be ensured, it doesn't really matter who puts the paper records into the computer. It only matters that it gets done, because once those records are in electronic format, the real fun begins.
Standardizing the Data
To make computers do what they do best, inconsistencies in real-world data need to be smoothed out. That's the situation you're facing when dealing with thousands of campaign contribution records that have been filled in by hand by dozens of campaign treasurers and aides. Names of contributors, and the companies they work for, will have almost endless variations. What you need to do once you've got the raw data in your computer is to standardize the names. Is John H. Jeffords the same as J. H. Jeffords or Jack Jeffords? For that matter, was John T. Jeffords a different person, or just a misprint? You can nearly always tell by checking the wider context.
Before you start standardizing, though, you've got to add two new fields to your database - a contributor ID and a "newemploy" field:
| Data | Field name | Length | Field type |
| Contributor ID | ContribID | 9 | Character |
| Occupation/employer | Newemploy | 40 | Character |
The contributor ID has a length of nine characters. The Federal Election Commission uses a nine-character code to identify candidates and PACs, so your database will be able to accommodate those codes directly if you use nine characters too. You probably will never need that many characters in a database under a million records, but the space can actually come in handy. You can have the computer generate a sequence of numbers, but my own advice would be to generate the ID based on two factors - the first three letters of the contributor's last name and then a five-digit sequential number. Don't worry if the numerical part of the ID isn't perfectly sorted alphabetically. It doesn't have to be; in the beginning, it only needs to be unique for each contribution. As you go through the contributor names alphabetically, finding multiple contributions from the same person, you'll replace all those unique IDs with a single ID for each contributor.
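Generating the initial IDs is a natural job for the computer. The sketch below, in Python, builds the eight-character form described above: three letters from the last name plus a five-digit counter. The function name `make_id_generator` is hypothetical, and padding very short last names with an underscore is my own assumption; the handbook doesn't say what to do with a two-letter surname.

```python
import re
from itertools import count

def make_id_generator(start: int = 1):
    """Return a function that hands out a fresh eight-character ID per record."""
    counter = count(start)

    def next_id(name: str) -> str:
        # Use only the part before the comma, i.e. the last name.
        last_name = name.partition(",")[0]
        letters = re.sub(r"[^A-Za-z]", "", last_name)[:3].upper()
        # Assumption: pad short surnames such as "Ng" so the ID stays 8 characters.
        letters = letters.ljust(3, "_")
        return f"{letters}{next(counter):05d}"

    return next_id
```

Call `make_id_generator()` once, then call the function it returns for every record as you enter it. Because the counter never repeats, every contribution starts out with a unique ID.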
To do that, you've first got to sort all the contributor names alphabetically. Line 'em up, Aaron through Zuchelli. Last name, followed by a comma, followed by the first name, as in "Maloney, Richard B." This is the simplest way to find identical, or nearly identical, names. When you find two names that are identical - or close matches with the same address - give them the same ID number. Here's how it works:
Original list:
| Contributor | Amount | Date | Candidate | ContribID |
| Jones, Henrietta | $250 | 4/12/94 | Calhoun | JON21929 |
| Jones, Henrietta | $500 | 9/4/94 | Wilson | JON21930 |
| Jones, Henrietta | $500 | 9/14/94 | Emerson | JON39321 |
| Jones, Henrietta | $250 | 11/1/94 | Emerson | JON40032 |
Since the contributor is the same in each case, you take the first ID number and copy it to all the other contributions made by the same person:
Revised list:
| Contributor | Amount | Date | Candidate | ContribID |
| Jones, Henrietta | $250 | 4/12/94 | Calhoun | JON21929 |
| Jones, Henrietta | $500 | 9/4/94 | Wilson | JON21929 |
| Jones, Henrietta | $500 | 9/14/94 | Emerson | JON21929 |
| Jones, Henrietta | $250 | 11/1/94 | Emerson | JON21929 |
It doesn't really matter which of the original contributor ID's you choose to duplicate. All that matters is that every contribution from Henrietta Jones is identified with the same ID.
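For the easy cases, where the names are letter-for-letter identical, the copying can be automated. Here is a minimal Python sketch with a hypothetical function name and hypothetical field names (`name`, `contrib_id`). It matches on the exact name only; close matches, and same-named people at different addresses, still need a human looking at the sorted list.

```python
def merge_identical_names(records):
    """Give every record with the same name the first ID seen for that name.

    records: list of dicts with at least 'name' and 'contrib_id' keys.
    Returns new dicts; the originals are left untouched.
    """
    first_id_for = {}
    merged = []
    for rec in records:
        canonical = first_id_for.setdefault(rec["name"], rec["contrib_id"])
        merged.append({**rec, "contrib_id": canonical})
    return merged
```

Fed the four Henrietta Jones records from the original list, this returns four records all carrying JON21929.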
You can skip the letters and use eight numbers if you want, but you do need to limit the ID to eight characters, because you're going to be adding a ninth character to some contributors later. That ninth character will be used to designate non-income-earning family members after you've connected them to the person in the household who brings home the bacon.
If Henrietta Jones, from the example above, had a non-income-earning husband named Clyde, his ID would become JON21929A. If Henrietta and Clyde had a 12-year-old daughter named Eunice, her ID would be JON21929B. This coding system also makes it easy to keep up with the Joneses if they multiply. Any other children who come along later (and give contributions) can be added simply by adding a C, D, E, etc. to the original eight-character ID. And all would be linked to Henrietta, since she's the one in the family who earns the income.
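The ninth-character scheme is simple enough to express in a couple of small functions. This Python sketch uses hypothetical names and assumes every breadwinner ID is exactly eight characters, as described above.

```python
import string

def family_id(breadwinner_id: str, position: int) -> str:
    """Build a family member's ID: position 0 is the spouse ('A'), 1 the first child ('B')."""
    if len(breadwinner_id) != 8:
        raise ValueError("breadwinner ID must be exactly eight characters")
    return breadwinner_id + string.ascii_uppercase[position]

def breadwinner_of(contrib_id: str) -> str:
    """Strip any ninth character to recover the income earner's ID."""
    return contrib_id[:8]
```

The payoff comes at totaling time: grouping records by `breadwinner_of(contrib_id)` pulls the whole household together under the income earner.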
There are a number of reasons why a contributor ID makes sense, but its biggest benefit is that it allows you to link related (and identical) contributors regardless of any variations in the name you find on the contribution reports. Henrietta Jones may show up in your database with any number of variations in her name - whether due to nicknames, typos, or any other reason. The ID number tells you these are all the same person, while preserving the data in the name field as it was entered in the original records.
Henrietta's case was an easy one. But as you sort through the records, you will inevitably come across contributors who may or may not be the same person. Here's an example:
| Name | Occupation | Zip | ContribID |
| Wilson, Harold G | Buzzell & Jones | 60611 | WIL00393 |
| Wilson, Harold G Jr | Buzzell & Jones | 60611 | WIL00394 |
| Wilson, Harold G Sr | Retired | 60453 | WIL00395 |
| Wilson, Harry | | 60453 | WIL00396 |
How many Harold Wilsons have we got here? Two at least, since Jr. and Sr. are clearly marked. But what about Harold G or plain old Harry? You can't tell from the name alone, but you usually can tell from the address, occupation, or occasionally from other fields. The more fields you can view in context, the easier it will be. (That's why it makes sense to set up your computer screen in a row-and-column format. Squeeze as many columns as you can on your screen to make comparisons easier.) If you still can't tell who's who from the data you have, add a coded letter to the ID - an "X" would be appropriate - to indicate that this may be connected with another contributor, but you can't confirm it yet. In the case of the Messrs. Wilson above, you would handle the ID's this way:
| Name | Occupation | Zip | ContribID |
| Wilson, Harold G | Buzzell & Jones | 60611 | WIL00393 |
| Wilson, Harold G Jr | Buzzell & Jones | 60611 | WIL00393 |
| Wilson, Harold G Sr | Retired | 60453 | WIL00395 |
| Wilson, Harry | | 60453 | WIL00395X |
Harold G. and Harold Jr. both have the same zip code and employer, so they're almost certainly identical. Harold Sr. lives in the same zip code as Harry, but you couldn't tell from this information alone whether he's the same as Harry Wilson. Only with the street address could you confirm it.
STANDARDIZING EMPLOYERS AND OCCUPATIONS
Contributor names aren't the only fields in need of standardization. You'll also need to clean up the names of contributors' employers and occupations. When you start assigning category codes to each contribution, you'll use the contributors' occupations/employers to determine their financial interests. You'll also use the occupation/employer information to generate lists of the leading contributors - but to get accurate totals, you'll first have to standardize the employer names.
As with individuals, the best way to do this is to sort the employer field alphabetically. If the records you're working with have information on the contributors' occupation and employers, you'll want to preserve that original data in the occupation and employer fields. To do that, and to store the new standardized company names, you use a new field - newemploy. What goes in the newemploy field? If you know the contributor's employer and his or her occupation, enter the employer's name - duly standardized - in newemploy. If you have only the occupation or the employer, put whichever one you have in the newemploy field. And if you have no information at all about the contributor's occupation or employer, leave newemploy empty. Here's how it works:
| Name | Occupation | Employer | Newemploy |
| Cossett, Miles | | | |
| Wilson, Harry | Retired | | Retired |
| Farquard, Harold | Accountant | Farquard & Doe | Farquard & Doe |
| Barnwell, Linda | Accountant | Self | Accountant |
| Finley, Peter | Attorney | | Attorney |
| Obote, Milton | Lawyer | Smith & Jones | Smith & Jones |
| McAuley, Alex | Attorney | Smith and Jones | Smith & Jones |
| Chat, Felix | | Tasty Top | Tasty Top Bakery |
| Gateau, Bernard | Executive | Tasty Top Bakery | Tasty Top Bakery |
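The rule for filling in newemploy can be sketched in a few lines of Python. The function name and the `SELF_EMPLOYED` set are hypothetical; treating "Self" as no employer at all follows the Linda Barnwell row in the table, but the other spellings in the set are my assumption. The sketch only chooses which value to use. Standardizing the spelling of the employer's name is a separate step.

```python
# Assumption: entries like these mean the contributor has no outside employer.
SELF_EMPLOYED = {"self", "self-employed", "self employed"}

def pick_newemploy(occupation: str, employer: str) -> str:
    """Prefer the employer; fall back to the occupation; else leave it empty."""
    occupation = occupation.strip()
    employer = employer.strip()
    if employer and employer.lower() not in SELF_EMPLOYED:
        return employer
    return occupation
```

So an accountant at Farquard & Doe is filed under the firm, a self-employed accountant under "Accountant," and a contributor with neither field filled in gets an empty newemploy.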
Once you've sorted the employer field alphabetically, do the same with the occupation field. As you do so, you will undoubtedly come across common occupations that don't line up alphabetically, but that are equivalent. Standardize those occupations under a consistent name in the newemploy field. Examples:
| Attorney, Lawyer, Law | -> | Attorney |
| Physician, Doctor, Medical doctor | -> | Physician |
| Homemaker, Housewife, Domestic engineer | -> | Homemaker |
| Real estate, RE, Real estate sales | -> | Real Estate |
| Insurance, Insurance agent | -> | Insurance Agent |
| Car dealer, auto dealer | -> | Car dealer |
| Educator, teacher | -> | Teacher |
| Accountant, CPA | -> | Accountant |
"Generic" occupations like these are not as critical as company names, since you won't be compiling them in your list of top contributors. They are handy, however, and there's nothing to be gained by having one total for "physicians," for example, and another for "doctors." On the other hand, don't sacrifice specificity for convenience. For example, don't convert "Real estate developer" into "Real estate" - or "Cardiologist" into "Physician." You might want to draw the distinction between each of those groups later, as particular legislative issues arise.
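A lookup table is the simplest way to apply these equivalents across a whole database. The Python sketch below uses the pairs from the list above; the names `OCCUPATION_SYNONYMS` and `standardize_occupation` are made up for illustration. It matches whole entries only, never parts of entries, so that "Real estate developer" and "Cardiologist" pass through untouched.

```python
# Keys are lowercase variants; values are the standardized form.
OCCUPATION_SYNONYMS = {
    "attorney": "Attorney", "lawyer": "Attorney", "law": "Attorney",
    "physician": "Physician", "doctor": "Physician", "medical doctor": "Physician",
    "homemaker": "Homemaker", "housewife": "Homemaker", "domestic engineer": "Homemaker",
    "real estate": "Real Estate", "re": "Real Estate", "real estate sales": "Real Estate",
    "insurance": "Insurance Agent", "insurance agent": "Insurance Agent",
    "car dealer": "Car dealer", "auto dealer": "Car dealer",
    "educator": "Teacher", "teacher": "Teacher",
    "accountant": "Accountant", "cpa": "Accountant",
}

def standardize_occupation(occupation: str) -> str:
    """Map a known variant to its standard form; leave anything else as entered."""
    cleaned = occupation.strip()
    return OCCUPATION_SYNONYMS.get(cleaned.lower(), cleaned)
```

Add to the table as new variants turn up in your own sorted lists.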
Once you've gone as far as you can in standardizing occupations/employers, and assigning IDs to each distinct contributor, it's time to standardize the newemploy field for each contributor with multiple contributions. One of the things you'll find as you go through the lists of contributions is that different candidates report different occupations for the same people. The more contributions a person makes, the more likely you are to turn up variations on what they do for a living. Here's an example:
| Name | Amount | Candidate | Occupation/Employer |
| Barnovski, Victor | $500 | Jones | Attorney |
| Barnovski, Victor | $250 | Fritz | Self-employed |
| Barnovski, Victor | $250 | Alexander | Barnovski & Schwartz |
| Barnovski, Victor | $500 | Eddington | Consultant |
| Barnovski, Victor | $250 | Montez | |
| Barnovski, Victor | $1,000 | Milton | Lawyer |
| Barnovski, Victor | $1,000 | Montoya | Barnovski & Schwartz |
If Mr. Barnovski had given only to candidate Montez, you'd never know what he did. Likewise with candidate Fritz. Fortunately, Barnovski spread enough money around that somebody finally got it right. From the information you can glean from all the reports, it appears that Barnovski is a partner in the law firm of Barnovski & Schwartz. This kind of variation in reported occupations/employers is commonplace - particularly among big givers who may in fact have several business interests. Often, the problem is not with the candidates, but with the lack of candor by people like Mr. Barnovski, who might have failed to fully identify himself (at least on paper) when giving his contribution. Be particularly wary of contributors from the Washington, D.C. area who list their occupation as "consultant." Many turn out to be lobbyists, but you won't be able to confirm that unless you consult a lobbyist directory, or find his name elsewhere in your database with a more accurate description of his livelihood.
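Choosing the right description for someone like Mr. Barnovski is a judgment call, but the computer can at least gather the evidence in one place. This Python sketch lists every distinct occupation/employer reported for each contributor ID, in the order first seen. The function and field names are hypothetical.

```python
from collections import defaultdict

def occupation_variants(records):
    """Collect the distinct non-blank occupation/employer entries per contributor ID.

    records: iterable of dicts with 'contrib_id' and 'occupation_employer' keys.
    """
    variants = defaultdict(list)
    for rec in records:
        value = rec["occupation_employer"].strip()
        if value and value not in variants[rec["contrib_id"]]:
            variants[rec["contrib_id"]].append(value)
    return dict(variants)
```

Any contributor who comes back with more than one variant is a candidate for a closer look before you settle on a single newemploy value.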
Fingerprinting Contributors
The term "fingerprinting" contributors means expanding the information you have on each contributor to include their occupation and employer, identifying other non-income-earning family members who may have contributed, and checking to see whether they've made any ideologically-based contributions to a political action committee.
In states where the occupation/employer is not provided, the job of discovering employers will be the most challenging step in the fingerprinting process. There are techniques to discover this information, and they'll be discussed later in this section. Even if the occupation/employer is listed, there's still a bit of work to do, as you'll uncover plenty of inconsistencies in the ways occupations are listed by different candidates.
The first step in the fingerprinting process, though, concentrates on finding other family members - spouses and children - who may have buttressed the family breadwinner's contributions with gifts of their own.
IDENTIFYING SPOUSES AND CHILDREN
The easiest way for a wealthy contributor to give more than the nominal contribution limit to the candidate of their choice is to give an extra contribution through their spouse. The number of contributors who do this is very large - so large, in fact, that the single biggest occupation listed on the rolls of the Federal Election Commission in a typical election cycle is "homemaker" or "housewife" or some similar variation.
Since most "homemakers" have no additional sources of income aside from their income-earning spouse, what you need to do is determine the family breadwinner's occupation/employer and assign that same occupation to all other family members who don't have independent incomes. In other words, if you identify William J. Harris as the president of First National Bank, and then you identify Rebecca Harris as his spouse, and Becky and Bill Jr. as his children, you assign the income earner's occupation/employer to all family members.
| Name | Employer | Newemploy |
| Harris, William J. | First National Bank | First National Bank |
| Harris, Rebecca | Homemaker | First National Bank |
| Harris, Becky | Student | First National Bank |
| Harris, William J. Jr. | Student | First National Bank |
Since you're doing this, you have to be sure when publishing the information, and compiling lists of top contributors, that you specify that your totals from each organization (in this case, First National Bank), include those of employees, officers and immediate family members. This is a legitimate way to account for the contributions, since it accurately tracks the economic interests of the contributors, but you have to be clear about your methodology.
The way to identify the spouses in the first place is to sort through the records so the husbands and wives (and any other family members) line up next to each other. The best way to do this is to sort by the last name, then by the zip code, like this:
| Name | Employer | Address | Zip |
| Harris, Pamela | Homemaker | 75 Cushman Place | 85011 |
| Harris, William J. | Accountant | 75 Cushman Place | 85011 |
| Harris, Joe | Retired | 88 Hazelnut St | 85023 |
| Harris, Loretta | Retired | 88 Hazelnut St | 85023 |
| Harris, Alexander | Consultant | 8381 Yucca Dr | 85023 |
By sorting the records in this way, most of the husband-wife combinations will be fairly obvious. Clearly, though, you're not going to match everyone this way. The most difficult case is when the two spouses list different addresses - the husband listing his office address and the wife listing the home address. Likewise, if the wife keeps her maiden name, she won't match up with her husband using this sorting method. One thing to watch for is hyphenated last names, an increasingly popular phenomenon on contribution rolls. If Juliet Wilson-Jones lists her occupation as homemaker, but her spouse doesn't turn up under W for Wilson, search J for Jones and you may find him.
You're never going to match every homemaker with his or her income-earning spouse, but you should be able to match the majority of them. If you've got a lot of "homemakers" left over with no apparent mates, sort the database again by the street address and try again.
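The sorting pass described above can be turned into a list of candidate households for you to review. The Python sketch below groups records that share a last name, zip code and street address, and reports any group containing more than one name. Function and field names are hypothetical, and adding the street address as a third key is my own tightening of the last-name-then-zip sort. The output is a list of leads to check, not confirmed families.

```python
from itertools import groupby

def household_key(rec):
    """Sort key: last name, then zip code, then street address."""
    last_name = rec["name"].partition(",")[0].strip().lower()
    return (last_name, rec["zip"], rec["address"].strip().lower())

def candidate_households(records):
    """Return lists of names that share a last name, zip code and street address."""
    ordered = sorted(records, key=household_key)
    groups = []
    for _, members in groupby(ordered, key=household_key):
        names = sorted({m["name"] for m in members})
        if len(names) > 1:
            groups.append(names)
    return groups
```

Run against the five Harris records in the table above, this pairs Pamela with William J. and Joe with Loretta, and leaves Alexander on his own.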
However you match husbands and wives (and their children), once you've matched them, you've got to do two things. The first is to assign them the same contributor ID as the family breadwinner, plus an extra letter denoting that they're family members. At OpenSecrets, we add an A for spouse and B, C, D, E, F, etc. for children. So if the breadwinner's ID number is 39384, the spouse is 39384A, and their two "student" children are 39384B and 39384C.
Once you've copied the new ID number, you can also replicate the newemploy field of the family breadwinner. If it's First National Bank, copy that down to the spouse and all the children. Remember, this applies only to family members who do not have an independent income. If Frank Miller lists his occupation as attorney and Lynn Miller, his wife, lists hers as psychologist, don't change either one of their occupations. He gets coded as a lawyer; she gets coded as a psychologist. You only copy down the breadwinner's occupation for members of the immediate family who do not have incomes of their own.
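Here is how that copy-down rule might look in Python. It is a sketch under a stated assumption: the `NON_EARNING` set, which decides who counts as having no independent income, contains only the two occupations used in the examples above, and you would want to extend it after looking at your own data. The function and field names are hypothetical.

```python
# Assumption: these reported occupations signal no independent income.
NON_EARNING = {"homemaker", "student"}

def inherit_newemploy(family_members, breadwinner_newemploy):
    """Copy the breadwinner's newemploy to family members without their own income.

    family_members: list of dicts with 'occupation' and 'newemploy' keys.
    Returns new dicts; members with their own occupation are left as they were.
    """
    updated = []
    for member in family_members:
        if member["occupation"].strip().lower() in NON_EARNING:
            member = {**member, "newemploy": breadwinner_newemploy}
        updated.append(member)
    return updated
```

Rebecca Harris the homemaker would come out coded to First National Bank, while Lynn Miller the psychologist would keep her own occupation.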
UNCOVERING OCCUPATIONS AND EMPLOYERS
It's one thing if your primary job is going through thousands of records standardizing occupations/employers and attaching them to spouses and children. It is quite another thing if your database covers a state where contributors don't report their occupations and employers and you've got to find them out on your own. In that case, you're going to need all the help you can get fingerprinting individual contributors. (You can use the same help if the contributor hasn't disclosed his occupation/employer despite state law.)
If you're dealing with major contributors - someone whose name keeps reappearing on multiple records - you might start by circulating the list of contributors to other reporters around the newsroom. Start with the political or statehouse reporters; they're the ones most likely to know if the giver is a lobbyist. It also helps to circulate "unknown contributor" lists to your news organization's remote bureaus around the state or region. Even if you don't recognize the name from your base in Chicago, the stringer in Peoria might very well recognize contributor names in that area.
If you're fortunate enough to be looking at an area that is covered by Polk or Johnson City Directories - those big, thick phonebook-like volumes with reverse address and phone listings - dig them out and start looking. Both Polk and Johnson City Directories include city residents' employers. To find out whether each of the directory publishers covers cities in your area, you can phone them. RL Polk & Co's customer service number is 804-353-0361. The Johnson Directories are no longer being published, but many libraries still have old copies.
Professionals are often listed in directories of their own, and the first one you should head to the library to dig up is the Martindale-Hubbell Law Directory. This multi-volume set is arranged by state and city, then alphabetically by lawyers and law firms. Considering the regularity with which lawyers contribute to political candidates (they gave $44 million to federal candidates in 1992), this is one book you might want to check your whole database against. Martindale-Hubbell also publishes a CD-ROM version of its directory - a far easier way to search for names than the hefty volumes, if you can find it at your library, or afford to buy a copy for the newsroom.
A similar professional volume, the American Medical Directory, lists all the nation's physicians. Published by the American Medical Association, it lists them alphabetically along with their city of residence. Your "hit rate" won't be nearly as good with this book as with Martindale-Hubbell, but it's an excellent reference if you're trying to determine whether "Dr. Milo Cohen" is a physician or a PhD.
Other obvious sources, if you've got the time, are plain old telephone directories - or new-fangled CD-ROM directories that cover white pages listings for the entire nation. Your best bet here is PhoneDisc (1-800-284-8353), which publishes four different CD-ROMs, ranging in price from $79 to $249 for more than 81 million residential phone listings and nine million businesses. It's not a complete directory - unlisted numbers are missing, and so are listings of people who withhold their home address. Nevertheless, the disks are an incredible bargain (they're widely available and heavily discounted). They're also a great way to track down old school chums you've lost track of over the years! Highly recommended.
IDEOLOGICAL CONTRIBUTORS
It's dangerous to make assumptions about why people give money to politicians, and it's best not even to try. By using a person's employer as a means for coding their contribution you're making no assumptions at all - simply reporting on the contributor's source of income. But some contributors clearly give not because of their occupation or employer, but for ideological reasons. There is a way to identify these ideological contributors without inferring unknowable motives, but it involves an extra bit of work.
The way to identify ideological contributors is to examine the lists of contributions to ideological PACs. If Willie Johnson, for instance, gives $500 to the National Right-to-Life PAC, then gives $250 to a candidate who's supported by Right-to-Life, that contribution can rightly be categorized as Pro-Life, whatever Willie's occupation. OpenSecrets uses this technique when classifying contributions to federal candidates.
It is a very conservative approach, but it's one that can clearly be supported from the public record. In order to be classified as an ideological contribution, two elements must be satisfied:
The contributor must have contributed to an ideological PAC, and
The candidate must have received funds from an ideological group with the same interest.
If either criterion is missing, Willie's contribution will be categorized by his occupation. For example, if Willie Johnson gave another $200 to a candidate who received no money from pro-life groups, his contribution wouldn't be counted as ideological. Granted, these criteria are so conservative they undoubtedly undercount ideological contribution totals. But it's all you can say for sure based solely on the public record.
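The two-part test translates directly into code. In this Python sketch, each contributor and each candidate carries a set of ideological interests drawn from the PAC records. The function name, the parameter names and the interest labels are all hypothetical, and the alphabetical tie-break for a contributor and candidate who share more than one interest is an arbitrary assumption of mine, not a rule from the text.

```python
def categorize_contribution(contributor_interests, candidate_interests, occupation_category):
    """Return an ideological category only when both criteria are met.

    contributor_interests: set of interests of ideological PACs the contributor gave to.
    candidate_interests: set of interests of ideological groups that funded the candidate.
    occupation_category: the fallback category based on occupation/employer.
    """
    shared = contributor_interests & candidate_interests
    if shared:
        # Assumption: if several interests match, pick one alphabetically.
        return min(shared)
    return occupation_category
```

Willie Johnson's $250 to a candidate backed by pro-life groups would come back as "Pro-Life"; his $200 to a candidate with no such backing would fall through to his occupation.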
Categorizing Contributors
The final stage of the fingerprinting process is the real payoff. In this phase of the research, you'll be assigning category codes to each contribution. The codes will correspond to the contributor's specific industry or interest group. When this phase is done, you'll have completed the database. All that will remain is reviewing it and finding whatever patterns stand out.
Though this is a rewarding step in the process, it's also a challenging one. Now that you've determined that Winifred Wyzinski, for example, works for Wyzinski & Associates, you've got to figure out what in the world Wyzinski & Associates does. Is it a lobbying firm? A trucking company? A management consulting firm? There's no telling from the name, so you'll have to find out elsewhere.
Fortunately, you can learn a surprising amount about what different companies do right on the shelves of your local library. Tens of thousands of corporations are listed and described - and their officers named - in publications such as Standard & Poor's Register of Corporations, Directors and Executives and Dun & Bradstreet's Million Dollar Directory. Doctors, lawyers, and many other professionals can be found in professional directories that you'll also find on the shelves of a well-stocked library. And if the companies are local, you can always phone them up.
The categorization process applies not only to individual contributors' employers. You'll be using the same categories for all classes of contributors - PACs, individuals, corporations and labor unions. For each one, you'll be trying to figure out which industry or interest group best describes what the contributor does, or what it stands for. Before reviewing the research materials that will help you categorize the contributors, an explanation of the categories themselves is in order.
The need for a system of categories is obvious, as soon as you start compiling contributions into a database. It is certainly useful to know who the biggest contributors are, and how much particular unions, companies and PACs are giving to political candidates. But what is even more important (and revealing) is figuring out how much whole industries are giving.
OpenSecrets' coding system had its roots in Alaska, where it was originally designed to match the patterns of political money going to members of the state legislature. That original system has undergone countless revisions over the years, along with a major expansion when the categories were applied to Congress. The system is still evolving; with each new election cycle we still tweak one or two categories, based on recent shifts in contribution patterns.
The coding system is hierarchical. At the very highest level, there are five super-categories: Business, Labor, Ideological/Single-Issue, Other and Unknown. Below that top level there are 13 "sectors," about 100 "industries" and in all, some 400 categories. A full list of all the categories is included here, but for the moment, here's how the sectors and industries break down. (The most detailed "category" level has been omitted here to save space).
- Agriculture
  - Crop Production & Basic Processing
  - Tobacco
  - Dairy
  - Poultry & Eggs
  - Livestock
  - Agricultural Services/Products
  - Food Processing & Sales
  - Forestry & Forest Products
  - Miscellaneous Agriculture
- Communications & Electronics
  - Printing & Publishing
  - Media/Entertainment
  - Telephone Utilities
  - Telecom Services & Equipment
  - Electronics Manufacturing & Services
  - Computer Equipment & Services
- Construction
  - General Contractors
  - Home Builders
  - Special Trade Contractors
  - Construction Services
  - Building Materials & Equipment
- Defense
  - Defense Aerospace
  - Defense Electronics
  - Miscellaneous Defense
- Energy & Natural Resources
  - Oil & Gas
  - Mining
  - Electric Utilities
  - Environmental Services/Equipment
  - Waste Management
  - Fisheries & Wildlife
  - Miscellaneous Energy
- Finance, Insurance & Real Estate
  - Commercial Banks
  - Savings & Loans
  - Credit Unions
  - Finance/Credit Companies
  - Securities & Investment
  - Insurance
  - Real Estate
  - Accountants
  - Miscellaneous Finance
- Health
  - Health Professionals
  - Hospitals/Nursing Homes
  - Health Services
  - Pharmaceuticals/Health Products
  - Miscellaneous Health
- Lawyers & Lobbyists
  - Lawyers/Law Firms
  - Lobbyists/Public Relations
- Transportation
  - Air Transport
  - Automotive
  - Trucking
  - Railroads
  - Sea Transport
  - Miscellaneous Transport
- Miscellaneous Business
  - Business Associations
  - Food & Beverage
  - Beer, Wine & Liquor
  - Retail Sales
  - Miscellaneous Services
  - Business Services
  - Recreation/Live Entertainment
  - Casinos/Gambling
  - Lodging/Tourism
  - Chemical & Related Manufacturing
  - Steel Production
  - Misc Manufacturing & Distributing
  - Textiles
  - Miscellaneous Business
- Labor
  - Building Trade Unions
  - Industrial Unions
  - Transportation Unions
  - Public Sector Unions
  - Miscellaneous Unions
- Ideological/Single-Issue
  - Republican/Conservative
  - Democratic/Liberal
  - Leadership PACs
  - Foreign & Defense Policy
  - Pro-Israel
  - Abortion Policy
  - Gun Rights/Gun Control
  - Women's Issues
  - Human Rights
  - Miscellaneous Issues
- Other
  - Non-Profit Institutions
  - Civil Servants/Public Officials
  - Education
  - Retired
  - Other
- Unknown
  - Homemakers/Non-income earners
  - No Employer Listed or Found
  - Generic Occupation/Category Unknown
  - Engineers, unclassified
  - Employer Listed/Category Unknown
  - Unknown
Each category has its own five-character code, which is entered in the computer as the category is learned. The first character is a letter and generally corresponds to the sector - A for Agriculture, H for Health, etc. The other characters are numbers, which are also arranged hierarchically. As an example, Energy Production & Distribution is E1000, the Oil & Gas industry is E1100, Gasoline Service Stations are E1170.
Besides its own code, each category is also linked to higher "industry" and "sector" codes. So when you enter E1170 for Milo's Texaco, his contributions will be included under the Oil & Gas industry, and the Energy & Natural Resources sector.
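As a rough sketch of how that linkage might work in practice, the Python below stores the industry and sector next to each category code and rolls contributions up to both levels. Only the two codes and their industry and sector names come from the text; the table layout, the function name, the second contributor and the dollar amounts are assumptions for illustration.

```python
from collections import defaultdict

# category code -> (industry, sector). A full table would hold some
# 400 rows; only these two codes are taken from the handbook text.
CATEGORY_LINKS = {
    "E1100": ("Oil & Gas", "Energy & Natural Resources"),
    "E1170": ("Oil & Gas", "Energy & Natural Resources"),
}


def roll_up(contributions):
    """Total (contributor, catcode, amount) records by industry and sector."""
    industry_totals = defaultdict(int)
    sector_totals = defaultdict(int)
    for _contributor, catcode, amount in contributions:
        industry, sector = CATEGORY_LINKS[catcode]
        industry_totals[industry] += amount
        sector_totals[sector] += amount
    return dict(industry_totals), dict(sector_totals)


# Dollar amounts and the second contributor are made up for the example.
contributions = [
    ("Milo's Texaco", "E1170", 250),
    ("Hypothetical Oil Co PAC", "E1100", 1000),
]
industries, sectors = roll_up(contributions)
print(industries)
print(sectors)
```

Because the links live in one lookup table, you only ever type the five-character category code on each record; the higher-level totals follow from it.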
SIC CODES
The category codes for business types are based, loosely, on a system of business classifications developed by the U.S. Government's Office of Management and Budget. That system is known as the Standard Industrial Classification, or SIC code. The basic SIC code list is a set of four-digit numbers that covers virtually every conceivable type of business, from "abrasive products" to "yarn texturizing, throwing, twisting and winding mills." The government uses these SIC codes to classify the millions of different businesses that operate in the U.S. More importantly, this standard government code has also been picked up by corporate directories, such as those put out by Standard & Poor's and Dun & Bradstreet. If you look up a company in Standard & Poor's Register of Corporations (which you'll find in the business reference section of any moderately sized library), you'll find the name of the company, its top corporate officers, and a full listing of any SIC codes that describe the company's lines of business.
SIC codes based on Yellow Pages listings have also been used by a number of CD-ROM publishers to classify businesses. Look up Bank of America on the PhoneDisc Business CD-ROM, for instance, and you'll find it identified under the SIC code of 6021, National Commercial Banks - because that's the Yellow Pages category under which its ad was listed. SIC codes are also used by several other invaluable CD-ROMs, including Dun's Business Locator, which uses the codes to identify over 9.2 million businesses across the country. (More on reference materials below.)
Electronic copies of OpenSecrets' category system are available from OpenSecrets at nominal cost. Phone us at 202-857-0044 for details.
It's possible that OpenSecrets' category system - which is designed primarily for congressional candidates, and is arranged to coincide with the jurisdictions of congressional committees - will need to be amended to fit your local circumstances. Feel free to amend it as needed; the system here is offered as a starting point. If you can keep fairly close to the coding system, however, it will help if you later want to exchange databases with someone from another state, or to supplement OpenSecrets' coding of federal candidates with your own coding of state and local candidates.
ASSIGNING CATEGORIES TO CONTRIBUTORS
There are any number of techniques for fingerprinting PACs, companies, labor unions, trade associations, and other contributors. For individuals, you'll be using the contributor's occupation/employer to determine the category code (unless you've identified them as an ideological contributor, as explained in the previous chapter).
It's often easiest to start with the PACs. First of all there are fewer of them, and they represent a large proportion of the campaign dollars at both the federal and state levels. At the federal level, PACs (or "political committees," as they're officially designated) don't have to declare to the Federal Election Commission what their agenda is. The only thing a PAC like "Citizens for Better Government" has to disclose is its name, address and treasurer. But if a PAC is sponsored by a corporation, labor union, trade association, or other organization, it must list its sponsoring group with the FEC.
The Realtors PAC, for example, is sponsored by the National Association of Realtors. If you're trying to categorize a state or local PAC that is affiliated with a federal PAC, your best bet may be to contact OpenSecrets and find out how we've coded it.
Most corporate and union PACs - even if your state doesn't have a counterpart to the FEC's "sponsor" - are relatively easy to identify simply from their name. The GTE Corporation Good Government Club represents GTE. The American Medical Association PAC represents the AMA. More problematic are ideological and single-issue groups, most of whose national PACs do not have sponsors. Americans for Good Government, for example, is a pro-Israel PAC, as is San Franciscans for Good Government and Citizens Concerned for the National Interest. Campaign America is a so-called "leadership" PAC sponsored by Senator Bob Dole of Kansas. Wish List is a PAC concerned with women's issues. The only way you're going to identify PACs with generic names like those is to ask them, or look them up if your state requires PACs to state their political agendas.
A good source for identifying federal PACs is the Almanac of Federal PACs by Edward Zuckerman (Amward Publishing). Updated biennially, it profiles all PACs that gave $50,000 or more to federal candidates and identifies their business or ideological interests.
Another good source is our book Open Secrets: The Encyclopedia of Congressional Money & Politics. The book identifies the primary interest of every PAC that gave $20,000 or more in the 1992 elections.
Before you can begin entering the codes, you've got to have a field in your database to hold them. If you haven't done it already, now is the time to add a five-character "catcode" field to hold the category code, and a second "source" field (10 characters or less in length) that you'll use to record how you identified the code.
| Data | Field name | Length | Field type |
| Category code | Catcode | 5 | Character |
| Source | Source | 5 or 10 | Character |
For example, suppose you look up Harold Farquard in the Martindale-Hubbell Law Directory and find that he's a lawyer for Smith, Farquard & Fritz of Seattle. Here's how you fill out the database:
| Name | Newemploy | Catcode | Source |
| Farquard, Harold | Smith, Farquard & Fritz | K1000 | MartHubb |
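If your database lives in a script rather than a database program, the same two fields could be sketched as a small Python record. This is a minimal illustration assuming the field lengths given above; the class name and the validation rules are not part of the original system.

```python
from dataclasses import dataclass


@dataclass
class Contribution:
    name: str
    newemploy: str
    catcode: str = ""   # five-character category code, blank until known
    source: str = ""    # how the code was identified, 10 characters or less

    def __post_init__(self):
        # Enforce the field widths described in the text.
        if self.catcode and len(self.catcode) != 5:
            raise ValueError(f"catcode must be 5 characters: {self.catcode!r}")
        if len(self.source) > 10:
            raise ValueError(f"source must be 10 characters or less: {self.source!r}")


record = Contribution("Farquard, Harold", "Smith, Farquard & Fritz",
                      catcode="K1000", source="MartHubb")
print(record)
```

Rejecting a malformed code at entry time is cheaper than hunting for it later when the category totals don't add up.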
You can figure out many of the category codes simply by looking at the name of the contributor. If you see a contribution from the AT&T PAC, or from an employee or officer of AT&T, you simply look up the code for long distance telephone carrier - C4200. Since AT&T is a well-known company, and you know what business they're in, you can safely apply its code and put the source down as "Name." The same would be true of contributions from the American Medical Association, the National Rifle Association, or any other high-profile group.
Many other contributors can be identified by name even if you never heard of them before, simply because the type of business is evident from the name. Fred's Texaco is clearly a gas station, E1170. Betty's Beauty Salon is a beauty parlor, G5100. Mercy Hospital is a hospital, H2100. Gibson Pharmaceuticals makes drugs (H4300), Bandon Ford-Mercury sells cars (T2300), Main Street Savings & Loan is an S&L (F1200). Contributions from any of these can safely be categorized simply from the name.
When you're beginning the categorization process, a useful way to proceed is to search for certain key words in the "newemploy" field (or in the "contributor name" field if you're reviewing corporate contributions). Isolate, for example, every contribution with the word "Hospital" - or better, "Hosp", since you'll find plenty of abbreviations in the reports. Once you've got all the "hosps" on your computer screen, you can eyeball the records, make sure they're all hospitals, then mark them with the appropriate code. Never let the computer do this step automatically. The search of "hosp" may also turn up entries like "Ray's Animal Hospital," which should be classified under veterinarians, or "Vanessa's Hospitality Service" which might require further investigation. For that reason, even when you use the computer to search key phrases, always review each name by hand, or you're asking for trouble. And then there are always the bedeviling exceptions of companies with misleading names - Rhode Island Hospital Trust, for example, which is not a hospital at all, but a commercial bank. If you're at all in doubt, don't fill in the code based on the name alone. You can always look up the company later.
To help you grind through the lists of companies in your database, here's a partial listing of keywords that can help you identify the type of business. Again, don't automatically assume these codes follow, but search the keywords and go through each list one by one.
| Keyword | Type of business | Code |
| Hosp | Hospital | H2100 |
| Real, RE, R E | Real estate | F4200 |
| Nursing | Nursing home | H2200 |
| Sav, S&L | Savings & loan | F1200 |
| National Bank | Commercial banks | F1100 |
| Natl Bank | Commercial banks | F1100 |
| Ford, Olds, Buick, etc | Car dealers | T2300 |
| Toyota, Honda, etc | Japanese import dealers | T2310 |
| Truck | Trucking company | T3100 |
| ISD, USD | Public school district | X3500 |
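A keyword search of this kind might look like the Python sketch below. In keeping with the warning above, it only gathers candidates and a suggested code for a person to review; it never writes a code to the record on its own. The function name and the dictionary layout are assumptions, and the employer names are the examples used in this chapter.

```python
# A few keyword -> suggested code pairs from the table above.
KEYWORD_CODES = {"hosp": "H2100", "nursing": "H2200", "truck": "T3100"}


def candidates_for_review(employers, keyword):
    """Return employers whose name contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [name for name in employers if needle in name.lower()]


employers = [
    "Mercy Hospital",
    "Ray's Animal Hospital",
    "Vanessa's Hospitality Service",
    "Rhode Island Hospital Trust",
    "Main Street Savings & Loan",
    "Fred's Texaco",
]

matches = candidates_for_review(employers, "hosp")
for name in matches:
    # The suggestion still has to be confirmed by eye, one record at a time.
    print(f"{name:<32} suggested {KEYWORD_CODES['hosp']} (confirm by hand)")
```

Note that three of the four matches here are not hospitals at all, which is exactly why the review step can't be skipped.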
Some codes can be applied based on the name of the contributor. Melvin G. Hobbes, MD is a physician, H1100. Rodney Jones, Esq. is a lawyer, K1000. A handful of other initials attached to names can also tell you what the contributor does for a living. Here's a short list.
| Abbreviation | Occupation | Code |
| MD | Physician | H1100 |
| DDS | Dentist | H1400 |
| Esq | Attorney | K1000 |
| CLU | Life insurance agent | F3300 |
| DO | Physician (osteopath) | H1100 |
| DVM | Veterinarian | A4110 |
| CPA | Accountant | F5100 |
| The Hon. | Public official | X3100 |
| Rev | Clergy | X0000 |
By the way, do not assume that "Dr." before a contributor's name means they're a physician. Dr. Henry Kissinger is not, nor are most people with PhDs. On the other hand, if you've got a number of otherwise unidentified contributors who list themselves as "Dr." you might want to check their names against a directory of physicians.
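Checking for trailing initials is easy to sketch in Python. The example below uses a subset of the abbreviations in the table and looks only at the last token of the name, so a leading "Dr." is deliberately ignored. The function name and the parsing rule are assumptions for illustration.

```python
import re

# A subset of the abbreviation table; keys are upper-case with periods removed.
SUFFIX_CODES = {
    "MD": "H1100",
    "DDS": "H1400",
    "ESQ": "K1000",
    "CLU": "F3300",
    "DVM": "A4110",
    "CPA": "F5100",
}


def suggest_code(name):
    """Suggest a category code from initials at the END of a name, if any."""
    tokens = re.split(r"[,\s]+", name.strip())
    last = tokens[-1].replace(".", "").upper()
    return SUFFIX_CODES.get(last)  # None when the name carries no known suffix


for name in ("Melvin G. Hobbes, MD", "Rodney Jones, Esq.", "Dr. Henry Kissinger"):
    print(name, "->", suggest_code(name))
```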
REFERENCE MATERIALS FOR IDENTIFYING CONTRIBUTORS
Fortunately for journalists who are trying to get a handle on the financial affiliations of political contributors, there is no shortage of reference materials that describe the business interests of different companies. You'll find many invaluable reference books on the shelves of your local library - look in the business reference section. Below is a rundown of some of the most valuable reference sources for identifying companies and contributors. Most are updated annually.
Standard & Poor's Register of Corporations, Directors & Executives. The biggest (and most useful) book in this three-volume set is the nearly 3,000-page listing of over 55,000 corporations. The companies are arranged alphabetically, and each listing identifies the companies' SIC codes and chief lines of business, their top executives, and in many cases their board of directors. It also provides their address and phone number, and additional data like their annual sales and number of employees. Two other smaller volumes complete the set. One lists some 70,000 corporate executives, along with their business affiliations. The other holds a number of indexes (including a very handy listing of all the SIC codes). Probably the most useful indexes are the Cross-Reference and Ultimate Parent indexes. Both of these link subsidiaries and affiliates with their corporate parents. If there's one book you should seek out above all others at the local library for help in identifying companies, this is the one. One caution, though: for the most part, these are not small companies. Ed's Towing, Farley's Bar & Grill, or Jones & Associates won't be listed. (Neither will most law firms or other professional offices, since they tend not to be corporations.)
Million Dollar Directory, published by Dun & Bradstreet Information Services. The format here is similar to Standard & Poor's. Corporate profiles are not as complete, but there are more of them - about 160,000 in the 1993 edition. It's a good backup or supplement to the S&P guide.
Ward's Business Directory of U.S. Private and Public Companies, published by Gale Research. Another useful publication, with the same general format as the books above, but without names of corporate executives. The 1993 edition listed 135,000 companies.
Directory of Leading Private Companies, published by the National Register Publishing Co. This reference, similar in format to the ones above, lists only privately-held companies with annual sales of $10 million or more.
American Medical Directory, published by the American Medical Association. This book lists, or attempts to list, every physician in the U.S. The names are arranged alphabetically, and show each doctor's name and city.
Martindale-Hubbell Law Directory. This is the closest thing you'll find to a listing of every lawyer and law firm in America. Entries in this multi-volume set are arranged by state, then city, then alphabetically by lawyer and law firm. If you can find (or buy) a copy, try to get your hands on Martindale-Hubbell's CD-ROM version of the directory, which is much easier to search than the printed version - at least if you're a fast typist. Whichever version you get, this directory is a real gold mine, since lawyers are one of the very biggest contributor groups, and they give heavily through individual contributions as opposed to PACs.
All the above directories list companies and individuals from across the nation. Your local library may also have a number of regional directories that can be quite valuable when you're looking at contributors to state and local campaigns.
Specialized volumes like the Texas Oil & Gas Directory and the Hollywood Creative Directory are extremely useful, if your database includes contributors from either of those areas or industries. Similar directories exist in every region of the country. The best thing to do is search the shelves of the biggest libraries in your area.
One geographically specialized volume is worthy of particular note, as its focus is Washington, D.C. and its subject matter includes some of the most influential contributors in the nation. Washington Representatives, published by Columbia Books, is the definitive guide to lobbyists in the nation's capital. It lists both individuals and law and lobbying firms, as well as their clients. It's also cross-indexed, so you can quickly find out who represents a particular company or organization in Washington. If you're trying to track down Washington lobbyists, this book is invaluable. Highly recommended.
The most efficient way to use these reference books, if you can't borrow a copy for the newsroom, is to head to the library with a printout of the companies you're trying to identify. Scope out the books in advance, and make sure your printout is sorted in the same order as the book.
CD-ROMS
The real frontier in directory publishing won't be found on bookshelves any more, but on computer. Many of the biggest publishers - including Standard & Poor's, Dun & Bradstreet and Martindale-Hubbell - are issuing electronic versions of their directories (often with more information than the printed version) on CD-ROM. These electronic versions, while sometimes prohibitively expensive, are an excellent reason for your newsroom to invest in a CD-ROM player if you don't already have one.
If you're unfamiliar with the technology, here's a quick description. CD-ROM stands for "compact disk - read-only memory." Beneath that humdrum nomenclature, however, lies the biggest revolution in media since the invention of the personal computer. The computer companies are hyping it to the hilt. "Multimedia" is their more glamorous buzzword. Essentially, a CD-ROM is a flat circular disk that looks identical to a music CD (it is), but instead of containing music, it contains data. Amazing quantities of data. Six hundred megabytes, to be exact - thousands and thousands of pages of text. The most popular CD-ROMs these days (like the growing number of multimedia encyclopedias) pack that little disk not with words so much as graphics and sound. The most useful ones for investigative reporters are light on the multimedia, but heavy on data. Unfortunately, many are also very expensive. If your newsroom can't afford them, hunt them down at the library.
Dun's Business Locator. A very expensive CD-ROM ($2,395 per year), but the best single source anywhere for millions of smaller companies all around the country. The most recent editions list over 9.1 million businesses. If you're trying to figure out what "Jones & Associates" of New Orleans is, this is the place to find out. One caveat, however: Dun's lists an inordinate number of businesses as "management services." Whether it's due to their interview techniques or some other reason, companies that shouldn't be classified this way often are. Aside from that one quirk, however, this disk is likely to be the most valuable one you'll find anywhere for identifying literally millions of companies that are listed nowhere else.
Standard & Poor's Corporations. This electronic edition of the classic Standard & Poor's Register of Corporations, Directors and Executives contains much more information than the book alone, though the price ($4,900 per year) tends to keep it in the hands of serious investors only. Much of it is of more interest to investors than to investigative reporters, but among the valuable things it does include are sections for most companies that describe their different lines of business more completely. This is most important when you're dealing with PACs sponsored by big, diversified corporations that could have multiple political interests. The CD-ROM often shows how much of a company's revenue comes from each part of its business - how much General Electric gets, for example, from its aerospace operations as opposed to its home appliance or power generation divisions. You can also search corporate executives by name, but beware - the "hit rate" on random searches in this database is pretty low. The disk is much more valuable for identifying the business interests of mid-sized to large corporations. (A less expensive version, which will consist of only the same material that's in the printed edition, is in the works.)
Martindale-Hubbell Law Directory. This is an electronic version of the multi-volume print edition that lists virtually every lawyer in the nation. ($995 for a one-year subscription). Type in the name and up pops the entry that identifies the attorney's law firm, address, and a host of other information, including where they went to law school and when they graduated. Use this reference as a confirmation that a particular contributor is a lawyer, but be aware that lawyers do switch firms from time to time. If Martindale-Hubbell lists them as working for one firm, while their contribution report lists them with another firm, go with the contribution report, as it will have the latest information. This CD-ROM is also valuable for searching names of companies that sound like they're law firms. A good bet here is to search for any unidentified company names that consist solely of names (Kirkland & Ellis, for example), that include the words "et al" (Reed, Smith et al), or that have an ampersand (&) in their title - though you'll want to eliminate firms that have "& Co" or "& Son," as they tend not to be law firms.
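The name-based screen for possible law firms could be roughed out in Python as follows. It covers only the two mechanical rules (an ampersand or "et al", minus names with "& Co" or "& Son"); spotting names that "consist solely of names" still takes a human eye. The function name and the regular expressions are assumptions, and matching the plural "& Sons" is an extension beyond the text. Every hit is only a candidate to look up in the directory.

```python
import re

# Names with "& Co" or "& Son(s)" tend not to be law firms.
NOT_LAW_FIRM = re.compile(r"&\s*(co|sons?)\b", re.IGNORECASE)
ET_AL = re.compile(r"\bet al\b", re.IGNORECASE)


def might_be_law_firm(employer):
    """Flag employer names worth checking against the law directory."""
    if NOT_LAW_FIRM.search(employer):
        return False
    return "&" in employer or ET_AL.search(employer) is not None


for name in ("Kirkland & Ellis", "Reed, Smith et al", "Baker & Son",
             "Acme & Co", "Mercy Hospital"):
    print(f"{name:<20} {might_be_law_firm(name)}")
```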
PhoneDisc. If you have ever stumbled through the stacks of a large library poring through their collection of out-of-town telephone directories, you will thank your lucky stars for this disk. Or rather, this collection of four different sets of disks: PhoneDisc Business, which applies SIC code classifications to nine million businesses based on the category they appear under in Yellow Pages ads; PhoneDisc Residential, which offers 80 million residential phone listings from the nation's White Pages directories; PhoneDisc ComboPack, which combines business and residential listings in one package; and PhoneDisc PowerFinder, which includes business and residential listings, searchable by phone number, address, SIC code and name, on five regional disks. List prices for the packages range from $79 to $249, though they're widely discounted at mail order computer houses. It's not as complete or reliable as Dun's Business Locator (since it's based on the secondary source of Yellow Pages listings), but it's only a tiny fraction of the price. Essentially, it's the poor man's Standard & Poor's. It's got its limitations, but what a bargain! This disk alone is reason enough to invest in a CD-ROM player for the newsroom. Highly recommended.
Street Atlas USA by DeLorme Mapping. It's a stretch to include this under campaign finance research materials, but it's such an amazing resource no newsroom should be without it. This one disk contains street maps of virtually every city, town, hamlet, highway and byway in America. You can zoom in to the town of your choice not only by typing in its name, but by typing its five-digit zip code, or even the area code and local telephone exchange, as in 202-857. Nearly every street in the country - interstate to gravel road - is in here, and nearly all of them are named. Once you've zoomed in to a particular city or town, you can type in the name of a street and the program will highlight it. In cities medium-sized and up, you can even look up addresses, like the 900 block of S. Halsted Street in Chicago! The list price is $169, but it's commonly discounted to $99 or less.
REAL-WORLD CATEGORIZATION PROBLEMS
Though it would be ideal to categorize every contribution to every politician, given the realities of the data you're dealing with, it's a practical impossibility. Even in states where filling out the occupation/employer of each contributor is required, not every candidate fills out every blank in every report. Even those that are filled out are not always possible to classify. "Self-employed," for example, tells you nothing about the contributor's source of income that you can translate into an industry or interest group category.
In OpenSecrets' research into the 1992 federal elections, we were able to categorize approximately 70 percent of the individual contributions with some meaningful category code. The ones that got away fell into five categories:
Homemakers, students and other non-income earners. These are the contributors who don't draw a salary and who haven't been linked either with an income-earning spouse or parent, or with a contribution to an ideological PAC.
Generic occupation - impossible to assign category. These are the ambiguous descriptions - "self-employed," "businessman," "entrepreneur," etc. Without more information, you can't tell how they earn their money.
Engineers, type unknown. This is a subset of the "generic occupation" problem. If someone says they're an engineer, they could work in any number of industries - from construction to oil & gas, manufacturing, electronics, even railroads.
No employer listed or discovered. These are the blanks. No occupation has been listed on the contribution reports, and you've been unable to find the occupation through other means.
Employer listed/category unknown. Even when a contributor lists his or her employer, you're not always going to be able to find out what that company does. D&E Enterprises could be anything. If you can't find them in a business directory, and they're not in the phone book (or you haven't got time to call), you're out of luck.
Given enough time and resources (and state laws that require contributors to list their occupations and employers) it should be possible to categorize 90 percent or more of all contributions. But in the real world of budgets and deadlines and multiple responsibilities, there is never enough time nor all the resources you'd like to have at hand. You should be able to identify virtually every political action committee, as well as the great majority of corporate contributors. The challenge here is the money that comes from individuals. If you can categorize 60 or 70 percent of that you'll still be doing a job you can be proud of.
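To see how close you are to those benchmarks, you can measure what share of your records carry a meaningful code. The Python below is a minimal sketch: the five "UNKN" codes are placeholders invented for this example (they are not real category codes), and the records and amounts are made up.

```python
# Placeholder codes standing in for the five "unknown" categories above.
# These are NOT real category codes; substitute whatever your list uses.
UNKNOWN_CODES = {"UNKN1", "UNKN2", "UNKN3", "UNKN4", "UNKN5"}


def coverage(records):
    """Return (share of contributions, share of dollars) with a real code.

    records: list of (catcode, amount) pairs.
    """
    total_count = len(records)
    total_dollars = sum(amount for _, amount in records)
    if total_count == 0 or total_dollars == 0:
        return 0.0, 0.0
    known = [(code, amount) for code, amount in records
             if code not in UNKNOWN_CODES]
    known_dollars = sum(amount for _, amount in known)
    return len(known) / total_count, known_dollars / total_dollars


# Made-up sample records.
records = [("K1000", 500), ("H2100", 250), ("UNKN1", 100), ("F1200", 150)]
by_count, by_dollars = coverage(records)
print(f"{by_count:.0%} of contributions, {by_dollars:.0%} of dollars categorized")
```

Reporting both figures is useful because a handful of large uncoded checks can hide behind a healthy-looking count.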
Searching for Patterns
Once your database is finished - you've categorized as many contributions as you can, and you've assigned "unknown" codes to all the rest - you're ready to begin compiling what you've got and digging up the patterns that stand out in bold relief. Just because the database is done, these patterns won't necessarily be obvious. Through the process of osmosis, you'll no doubt have picked up some interesting nuggets - an unexpectedly large concentration of funds from companies you might not have suspected, major bundling operations to specific politicians, etc. But the overall patterns of dollars going to the state legislature, for example, still need to be fished out of the data. To pull the trends out - and to find material for the stories you'll do later - there are a number of steps you can take.
Calculate totals for every category. This is a logical first step, and one that is guaranteed to show you things you'd never have known without doing all the research. To calculate category totals all you do is sort the database by the category code and have the computer generate totals. Save these totals as a separate file. What you'll have is a list like this:
| A0000 | Agriculture | $8,000 |
| A1000 | Crop production & basic processing | $83,039 |
| A1100 | Cotton | $500 |
| A1200 | Sugar cane & sugar beets | $12,300 |
| A1400 | Vegetables, fruit and tree nuts | $41,925 |
| A1500 | Wheat, corn, soybeans and cash grain | $10,050 |
| A1600 | Other commodities (incl rice, peanuts, honey) | $6,839 |
| A1300 | Tobacco & Tobacco products | $65,850 |
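In a scripting language the sort-and-total step takes only a few lines. The Python sketch below mirrors the procedure described above: sort the records by category code, then total each group. The individual contribution amounts are made up, chosen so that they add up to three of the totals in the table.

```python
from itertools import groupby
from operator import itemgetter

# (catcode, amount) pairs; the individual amounts are hypothetical.
records = [
    ("A1200", 7300),
    ("A1400", 41925),
    ("A1100", 500),
    ("A1200", 5000),
]


def category_totals(records):
    """Sort records by category code, then sum the amounts in each group."""
    ordered = sorted(records, key=itemgetter(0))
    return {
        code: sum(amount for _, amount in group)
        for code, group in groupby(ordered, key=itemgetter(0))
    }


totals = category_totals(records)
for code, amount in totals.items():
    print(f"{code}  ${amount:,}")
```

The sort matters: `groupby` only merges records that sit next to each other, which is the same reason the database has to be sorted before the computer generates totals.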
The next step is to aggregate these totals by industry and sector. Each category code - A1000, A1100, etc. - has an industry code associated with it. In OpenSecrets' category database (outlined here), this code is named "catorder" and it consists of one letter and two numbers. The industry called Crop Production & Basic Processing, for example, is coded A01, and it includes not just A1000, but A1100, A1200, A1400, A1500 and A1600 as well. Note that it does not include A1300 - tobacco is classified as an industry in its own right. The A0000 code, which is a catchall category for agriculture-related contributors that you can't put anywhere else, carries a catorder of A11 (miscellaneous agriculture).
When you total up the list above, here's what you come up with:
| A01 | Crop production & basic processing | $154,653 |
| A02 | Tobacco & tobacco products | $65,850 |
| A11 | Miscellaneous agriculture | $8,000 |
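The roll-up from categories to industries can be checked with a short Python sketch. The category totals and catorder codes come from the sample above; the dictionary names are just illustrative choices.

```python
from collections import defaultdict

# Category totals from the sample list above.
category_totals = {
    "A0000": 8_000, "A1000": 83_039, "A1100": 500, "A1200": 12_300,
    "A1400": 41_925, "A1500": 10_050, "A1600": 6_839, "A1300": 65_850,
}

# Category code -> industry code ("catorder"), as described in the text.
catorder = {
    "A0000": "A11",  # miscellaneous agriculture
    "A1000": "A01", "A1100": "A01", "A1200": "A01",
    "A1400": "A01", "A1500": "A01", "A1600": "A01",
    "A1300": "A02",  # tobacco is an industry in its own right
}

industry_totals = defaultdict(int)
for code, total in category_totals.items():
    industry_totals[catorder[code]] += total

for industry, total in sorted(industry_totals.items()):
    print(f"{industry}  ${total:,}")
```

The same pattern, with a category-to-sector lookup in place of `catorder`, produces sector totals.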
Do the same for all your categories, and you'll have compiled totals for approximately 100 industry and interest groups. The overall patterns will begin falling in place now, but 100 categories is still too many to illustrate in a simple graph that will show you at a glance which are the top contributors. To do that, you'll generate totals for each of 12 main sectors - Agriculture, Construction, Health, Labor, etc.
Just as each category has an industry or "catorder" attached to it, so it also has a sector attached to it. All the categories shown in the sample above fall within the Agriculture sector. Look here to see the sectors for each category code and you'll see how they all fit into place.
When you've generated totals for each sector, you'll have a list like this (the totals here are from the 1992 federal elections):
| Agriculture | $24,892,124 |
| Communications/Electronics | $21,232,464 |
| Construction | $15,246,489 |
| Defense | $8,328,760 |
| Energy & Natural Resources | $21,341,235 |
| Finance, Insurance & Real Estate | $71,091,876 |
| Health | $31,710,239 |
| Lawyers & Lobbyists | $44,058,744 |
| Miscellaneous Business | $38,478,007 |
| Transportation | $18,989,690 |
| Labor | $43,299,597 |
| Ideological/Single-Issue | $29,331,914 |
This information is compact enough to turn into a chart. You can do this in any spreadsheet program on your computer, or in a stand-alone graphics program. DeltaGraph on the Macintosh is an outstanding program for creating charts and graphs - and your paper's graphics department probably already has a copy.
But don't wait for the final publication of your stories to create a set of charts. Graphing the summary data is an excellent idea at this stage, even though you're not ready yet to put together the final graphics for your story. Unless you possess extraordinary powers of symbolic intuition, rows and columns of numbers, commas and dollar signs are not as effective in communicating patterns as a simple bar chart. The numbers, after all, are only symbols that represent patterns in reality. The chart is a direct, visual representation of those patterns. It's much easier to grasp intuitively - an important consideration when you're poring through piles of data trying to figure out what's significant.
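If no charting program is at hand, even a rough text chart will do for exploration. The sketch below is one possible approach, not a substitute for a real graphics package: it uses the 1992 sector totals listed above, and the function name and bar width are arbitrary.

```python
# Sector totals from the 1992 federal elections, as listed above.
sector_totals = {
    "Agriculture": 24_892_124,
    "Communications/Electronics": 21_232_464,
    "Construction": 15_246_489,
    "Defense": 8_328_760,
    "Energy & Natural Resources": 21_341_235,
    "Finance, Insurance & Real Estate": 71_091_876,
    "Health": 31_710_239,
    "Lawyers & Lobbyists": 44_058_744,
    "Miscellaneous Business": 38_478_007,
    "Transportation": 18_989_690,
    "Labor": 43_299_597,
    "Ideological/Single-Issue": 29_331_914,
}

def bar_chart(totals, width=40):
    """Return one line per entry, largest first, bars scaled to the biggest total."""
    largest = max(totals.values())
    label_width = max(len(name) for name in totals)
    lines = []
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    for name, amount in ranked:
        bar = "#" * round(amount / largest * width)
        lines.append(f"{name:<{label_width}}  {bar} ${amount:,}")
    return lines

chart = bar_chart(sector_totals)
print("\n".join(chart))
```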
Even this simple graph is very revealing. We see immediately that the Finance/Insurance/Real Estate sector is the leading source of campaign funds, far ahead of any other sector. Labor and Lawyers head the second tier of top givers, and Defense stands out as the financial runt of the litter. But there's a lot more information you can also find out from the data you've already compiled. Two of the most important elements that you can examine, and chart, are the breakdown between funds that came from PACs versus individuals (or PACs versus corporations, unions, and individuals, if your state allows all those groups to contribute), and the breakdown in contributions to Democrats and Republicans.
To get this information you'll have to go back to your original database and generate new totals. The first one is based on that single-character field that listed a code for the type of contribution - I for individual, P for PAC, etc. Sort the database by contribution type, then by category, and generate new category totals for PACs, individuals, etc. Aggregate the separate categories into industries and sectors, and generate a new chart. The chart below takes the same data we looked at above, and highlights how much came from PACs versus individuals in each sector.
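As a rough sketch of that second pass, the Python below totals contributions by category and by contribution type at once. The records are invented for illustration; only the I and P type codes come from the text.

```python
from collections import defaultdict

# Hypothetical records: (contribution type, category code, amount).
# "I" = individual and "P" = PAC, following the type codes described above.
records = [
    ("P", "A1200", 5_000),
    ("I", "A1200", 1_000),
    ("I", "A1300", 500),
    ("P", "A1300", 2_000),
    ("I", "A1200", 250),
]

# totals_by_type[category][type] -> dollars
totals_by_type = defaultdict(lambda: defaultdict(int))
for ctype, code, amount in records:
    totals_by_type[code][ctype] += amount

for code in sorted(totals_by_type):
    pac = totals_by_type[code]["P"]
    individual = totals_by_type[code]["I"]
    print(f"{code}  PACs ${pac:,}  individuals ${individual:,}")
```

Once these split totals exist at the category level, they can be rolled up into industries and sectors the same way as the overall totals.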
Immediately we have new insights. You can see, for example, that the great majority of contributions from lawyers and lobbyists comes from individual contributors, not PACs. The opposite is the case with labor donations - nearly all are delivered through political action committees. That's significant, it's worth a story, and it suggests a whole new line of questioning you can undertake when you begin doing interviews.
More revelations are in store when you break down the sector totals by party. To get to that point, though, you've first got to add a new field to the database - the party affiliation of each candidate. If you've already gathered this information and put it into a separate database of all candidates, it's a relatively easy matter to merge the two databases together and update your main database. All you need is a single field that's common to both databases, like a candidate ID. With most database programs, updating a field in one database with information from another is a relatively straightforward task.
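In code, the merge amounts to a lookup on the shared field. The following Python is a minimal sketch under assumed field names (`candidate_id`, `party`) and made-up records; the "U" code for an unmatched candidate is also an assumption.

```python
# Hypothetical candidate database keyed by candidate ID.
candidates = {
    "C001": {"name": "Candidate A", "party": "D"},
    "C002": {"name": "Candidate B", "party": "R"},
}

# Hypothetical contribution records that share the candidate ID field.
contributions = [
    {"candidate_id": "C001", "category": "A1200", "amount": 1_000},
    {"candidate_id": "C002", "category": "A1300", "amount": 500},
    {"candidate_id": "C999", "category": "A1300", "amount": 250},  # no match
]

def attach_party(contributions, candidates, unknown="U"):
    """Return copies of the contribution records with a 'party' field added."""
    merged = []
    for record in contributions:
        candidate = candidates.get(record["candidate_id"])
        party = candidate["party"] if candidate else unknown
        merged.append({**record, "party": party})
    return merged

merged = attach_party(contributions, candidates)
for record in merged:
    print(record["candidate_id"], record["party"], record["amount"])
```

Flagging unmatched IDs rather than dropping them makes it easier to spot candidates missing from the candidate database.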
When you've got the party information attached to each contribution, sort the database by party, then by category, and generate new totals. Graph them, again using the 1992 federal election data, and your new chart provides a whole set of insights:
The overwhelmingly Democratic tilt of labor unions now stands out dramatically. Lawyers and lobbyists also clearly favor Democrats by a wide margin, as do ideological and single-issue groups. Most of the other sectors split their dollars fairly evenly between the parties. The chart, and the numbers that go into it, once again suggest a whole new series of questions you'll want to pursue.
In fact, every chart you do, every summary total you generate, every way you look at the data raises new questions and brings new insights. Some will be dramatic, others will be subtle. All bear further investigation. Looking at the data won't necessarily answer all your questions; rather, it will suggest the questions you ought to raise - of candidates, funders and political gurus alike.
Though the examples above deal with sector totals, you'll want to generate the same data, and many of the same exploratory charts, for every industry and category. Once you've generated totals at the category level, it's easy to aggregate the numbers into industries and sectors.
Here are some suggestions for other ways of looking through your database to find potentially interesting patterns:
Calculate totals from business vs. labor vs. ideological/single-issue groups. Labor is a sector by itself. So is Ideological/Single-Issue. Aggregate the ten business-related sectors (Agriculture through Transportation in the list above) and you'll have the total for business contributors. When you total the numbers this way, you'll probably find the same pattern that's evident at the federal level - namely, that most contributions come from business groups. Unions and ideological groups may also be big contributors, but not when compared with the combined total of all business categories.
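Applied to the 1992 sector totals listed earlier, the comparison can be sketched in Python as follows, treating every sector other than Labor and Ideological/Single-Issue as business. The variable names are illustrative.

```python
# Sector totals from the 1992 federal elections, as listed earlier.
sector_totals = {
    "Agriculture": 24_892_124,
    "Communications/Electronics": 21_232_464,
    "Construction": 15_246_489,
    "Defense": 8_328_760,
    "Energy & Natural Resources": 21_341_235,
    "Finance, Insurance & Real Estate": 71_091_876,
    "Health": 31_710_239,
    "Lawyers & Lobbyists": 44_058_744,
    "Miscellaneous Business": 38_478_007,
    "Transportation": 18_989_690,
    "Labor": 43_299_597,
    "Ideological/Single-Issue": 29_331_914,
}

non_business = {"Labor", "Ideological/Single-Issue"}

groups = {
    "Business": sum(
        amount for name, amount in sector_totals.items()
        if name not in non_business
    ),
    "Labor": sector_totals["Labor"],
    "Ideological/Single-Issue": sector_totals["Ideological/Single-Issue"],
}

grand_total = sum(groups.values())
for name, amount in groups.items():
    share = 100 * amount / grand_total
    print(f"{name:<26} ${amount:>12,}  {share:5.1f}%")
```

On these figures business accounts for roughly four of every five dollars.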
Find out which industries are the biggest supporters of the Democrats and Republicans. If you generate a list of the leading industries giving to candidates from each party, you will almost certainly find that some industries show up at or near the top of both parties' chief supporters. That means heavy political clout, no matter who wins at the polls - a red flag that ought to provoke a closer look at the industry's legislative agenda, and what it's received for its bipartisan investment.
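One way to spot those bipartisan givers is to rank industries for each party and look for names on both lists. The Python below is a sketch with invented industry names and dollar figures.

```python
# Hypothetical industry totals by party, in dollars.
party_totals = {
    "Industry W": {"D": 900, "R": 850},
    "Industry X": {"D": 700, "R": 100},
    "Industry Y": {"D": 50, "R": 600},
    "Industry Z": {"D": 300, "R": 200},
}

def top_industries(totals, party, n):
    """Return the n industries that gave the most to the given party."""
    ranked = sorted(totals, key=lambda name: totals[name][party], reverse=True)
    return ranked[:n]

top_dem = top_industries(party_totals, "D", 2)
top_rep = top_industries(party_totals, "R", 2)

# Industries near the top of both lists have clout whoever wins.
both = [name for name in top_dem if name in top_rep]
print("Top Democratic givers:", top_dem)
print("Top Republican givers:", top_rep)
print("On both lists:", both)
```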
Calculate the industries that are most heavily partisan. This is simply a percentage for each industry - what proportion went to Democrats versus Republicans. Labor unions, and a few ideological categories, will almost certainly top the most-partisan list for Democrats. But what other industries give heavily to them in your state? Likewise, which industries most strongly support Republican candidates?
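The percentage itself is simple arithmetic. The sketch below, again with invented figures, computes the Democratic share of two-party dollars and ranks industries by how far that share sits from an even split.

```python
# Hypothetical two-party totals per industry: (Democrats, Republicans).
party_dollars = {
    "Industry X": (700, 100),
    "Industry Y": (50, 600),
    "Industry Z": (300, 200),
}

def democratic_share(dem, rep):
    """Percentage of two-party dollars that went to Democrats."""
    total = dem + rep
    return 100 * dem / total if total else 0.0

shares = {
    name: democratic_share(dem, rep)
    for name, (dem, rep) in party_dollars.items()
}

# Most partisan first: largest distance from a 50/50 split.
ranked = sorted(shares, key=lambda name: abs(shares[name] - 50), reverse=True)
for name in ranked:
    print(f"{name}: {shares[name]:.1f}% to Democrats")
```

Small industries can look extremely partisan on the strength of one or two checks, so it may be worth setting a minimum dollar total before ranking.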
Calculate the fastest growing and fastest declining industries. Obviously, this is one trend you won't be able to spot until you've got more than one election cycle to look at. But when you do, it will provide you with a very important bit of information. Most industries at the federal level are surprisingly consistent in their contributions from year to year. But if there's heavy political action on the horizon, look for an industry's contributions to soar. That's what happened to health care contributions in 1992, as Congress prepared to consider massive changes to the nation's health and insurance system. The National Rifle Association also greatly stepped up its giving in 1992 - a sign the NRA was trying to shore up its defenses against a rising tide of gun control legislation. If an industry - or an individual company or PAC - dramatically increases its giving from one election cycle to the next, you can be sure that something is afoot.
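With two cycles of industry totals in hand, the growth calculation might look like the Python sketch below. The figures are invented, and industries missing from either cycle, or with nothing given in the earlier one, are simply skipped.

```python
# Hypothetical industry totals for two consecutive election cycles.
previous = {"Industry X": 200_000, "Industry Y": 100_000, "Industry Z": 50_000}
current = {"Industry X": 210_000, "Industry Y": 250_000, "Industry Z": 25_000}

def percent_change(old, new):
    """Percentage change from the earlier total to the later one."""
    return 100 * (new - old) / old

changes = {
    name: percent_change(previous[name], current[name])
    for name in previous
    if name in current and previous[name] > 0
}

# Fastest growing first, fastest declining last.
ranked = sorted(changes, key=changes.get, reverse=True)
for name in ranked:
    print(f"{name}: {changes[name]:+.1f}%")
```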
