What you need to know


Key terms defined on this page:
| Database | Field | Key Field | Record | Search | Query | Sort | Database Management System | Mail Merge | Flat-file Database | Relational Database | Data Redundancy | Data Integrity | Data Mining | Data Matching | Concerns of Data-Matching & Data-Mining |

Database


A collection of related data held in a single file, or set of files, for sorting, analyzing and reporting. Typical database operations include browsing, simple and complex queries, sorting data, and printing reports, labels and form letters.

Advantages of Database:
  • Easier to store large quantities of information
  • Easier to retrieve information quickly and flexibly
  • Easy to organize and reorganize information
  • Easier to print and distribute information in a variety of ways

Field


A single element of data in a single record within a database. It is the part of a record that represents one item of data. The fields within a database structure can be of different types depending on the data to be stored, for example text, numeric or date.

Key Field


A unique identifier for each record: a field that contains data that uniquely identifies the record, for example a student ID number.

Record


A single entry for an entity in a database, which may be composed of more than one data field. It is a number of related items of information that are handled as a unit.
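The definitions of field, key field and record can be illustrated with a minimal Python sketch. The student data, the field names and the helper `is_key_field` are all hypothetical and exist only for illustration: each dictionary stands for one record, each dictionary key for one field, and the helper checks whether a field could serve as a key field.

```python
# Hypothetical student records: each dict is one record, each key is one field.
students = [
    {"student_id": 1001, "name": "Ana Lee", "birth_date": "2006-03-14"},
    {"student_id": 1002, "name": "Ben Cho", "birth_date": "2005-11-02"},
    {"student_id": 1003, "name": "Ana Lee", "birth_date": "2006-07-30"},
]


def is_key_field(records, field):
    """Return True if every record holds a distinct value in `field`."""
    values = [record[field] for record in records]
    return len(values) == len(set(values))


print(is_key_field(students, "student_id"))  # True: every ID is different
print(is_key_field(students, "name"))        # False: two students share a name
```

In this sample the name field cannot be a key field because two records share the same name, while the student ID identifies each record uniquely.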

Search


Looking for a specific record. See Query.

Query


An information request. This can be a simple search for a specific record or a request to select all records that match a set of criteria.
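Both kinds of query can be sketched with Python's built-in `sqlite3` module and an in-memory database. The `students` table and its contents are made up for this example; the first query searches for one specific record, and the second selects every record that matches a criterion.

```python
import sqlite3

# In-memory database with a hypothetical students table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT, grade INTEGER)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1001, "Ana", 11), (1002, "Ben", 12), (1003, "Cara", 11)],
)

# Simple search: look up one specific record by its key field.
one_student = conn.execute(
    "SELECT name FROM students WHERE student_id = ?", (1002,)
).fetchone()

# Query with criteria: select all records where grade is 11.
grade_11 = conn.execute(
    "SELECT name FROM students WHERE grade = ? ORDER BY name", (11,)
).fetchall()

print(one_student)  # ('Ben',)
print(grade_11)     # [('Ana',), ('Cara',)]
```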

Sort


A command that arranges records in alphabetic or numeric order based on the values in one or more fields. Records often need to be rearranged so that the data can be used efficiently.
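As a rough illustration, sorting on one or more fields might look like this in Python. The records and field names are invented for the example; `sorted` and `operator.itemgetter` are standard Python.

```python
from operator import itemgetter

# Hypothetical records to be sorted.
records = [
    {"surname": "Cho", "first_name": "Ben", "age": 17},
    {"surname": "Lee", "first_name": "Ana", "age": 16},
    {"surname": "Cho", "first_name": "Amy", "age": 18},
]

# Alphabetic order on two fields: surname first, then first name.
by_name = sorted(records, key=itemgetter("surname", "first_name"))

# Numeric order on one field, largest value first.
by_age_desc = sorted(records, key=itemgetter("age"), reverse=True)

print([r["first_name"] for r in by_name])      # ['Amy', 'Ben', 'Ana']
print([r["first_name"] for r in by_age_desc])  # ['Amy', 'Ben', 'Ana']
```

With this sample data the two orderings happen to list the same first names in the same sequence; different data would generally give different results.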

Database Management System


A database management system (DBMS) is a program that sorts, links and otherwise organizes and manages data in a database. A DBMS may also assist in the analysis of data and the preparation of reports. It manages shared access to the database so that problems do not arise when two people try to access and update the same record at the same time. A DBMS can also recover the database in the event of system failure, and it can handle password allocation and checking, which helps restrict user access to data.

Mail Merge


This is where the contents of a database are linked to a word processor. A ‘standard letter’ (sometimes referred to as a ‘form letter’) is written which contains field name references where personal information would be written. As it prints, the computer merges the database with the standard letter to produce personalized letters.
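The merging step can be sketched in Python with `string.Template`, where each `$placeholder` plays the role of a field name reference in the standard letter. The letter wording and the member records are hypothetical, and a real word processor would of course do this through its own mail merge feature rather than code.

```python
from string import Template

# The standard letter: $names are field name references.
form_letter = Template(
    "Dear $first_name $surname,\nYour membership expires on $expiry.\n"
)

# Hypothetical database records, one per recipient.
members = [
    {"first_name": "Ana", "surname": "Lee", "expiry": "1 March"},
    {"first_name": "Ben", "surname": "Cho", "expiry": "15 April"},
]

# Merge each record into the standard letter to get personalized letters.
letters = [form_letter.substitute(member) for member in members]

print(letters[0])
```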

Flat-file Database


A flat-file database is essentially a single table. It is typically an ASCII file containing data and usually serves as a database file. A flat-file record may occupy a single line, or several records may occur in a line or block of data. Flat-files are less useful for high speed searches or for linking two or more sources of data. They are easily transferred between various operating systems and database managers.
  • Advantages:
    • all data is visible, including earlier entries, so it can be easier to compare and analyze data
    • simple to implement
  • Disadvantages:
    • data redundancy (repeated data) can occur
    • spelling errors can occur when the same data is typed in more than once
    • difficult to search
    • uses lots of storage space
    • if there are several copies, each one has to be updated
    • data becomes unreliable
    • only one user can access the database at any time
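To make the redundancy problem concrete, here is a small Python sketch that reads a made-up flat file with the standard `csv` module. The file contents, column names and the deliberate room mismatch are all assumptions for illustration: the teacher's details are repeated on every row, and one copy has drifted out of step with the others.

```python
import csv
import io
from collections import Counter

# A hypothetical flat file: teacher details are repeated on every row.
flat_file = io.StringIO(
    "student_id,name,course,teacher,teacher_room\n"
    "1001,Ana,Biology,Ms Diaz,B12\n"
    "1002,Ben,Biology,Ms Diaz,B12\n"
    "1003,Cara,Biology,Ms Diaz,B21\n"
    "1004,Dev,History,Mr Obi,H3\n"
)
rows = list(csv.DictReader(flat_file))

# Count how many copies of each teacher's details are stored.
teacher_copies = Counter(row["teacher"] for row in rows)

# Collect every room recorded for each teacher to spot copies that disagree.
rooms_per_teacher = {}
for row in rows:
    rooms_per_teacher.setdefault(row["teacher"], set()).add(row["teacher_room"])
inconsistent = {
    teacher: sorted(rooms)
    for teacher, rooms in rooms_per_teacher.items()
    if len(rooms) > 1
}

print(teacher_copies["Ms Diaz"])  # 3
print(inconsistent)               # {'Ms Diaz': ['B12', 'B21']}
```

Because the room is stored three times, a change has to be made in three places, and a single missed or mistyped copy makes the data unreliable.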

Relational Database


A database consisting of files that can be viewed as collections of tables of rows and columns. A spreadsheet would not be a relational database. Each table row is a record of one entity. Each column represents a specific field of data, e.g. name, age, weight or height. The tables usually contain a unique identifier (Key Field) for each record (row). Data from two or more tables may be combined by matching the unique identifiers.
  • Advantages:
    • all data is in a common pool, accessible by all applications
    • data is in one place where everyone can access (if they have permission)
    • the system is much more flexible
    • much less data is needed to be held
    • easier to maintain high quality information
  • Disadvantages:
    • unproductive maintenance, where programs are still dependent on the structure of the data. If a department needs to add, for example, a new field, all the programs have to be altered for all departments
    • lack of security, so that even the most sensitive information could be accessed
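Combining data from two tables by matching a key field can be sketched with Python's built-in `sqlite3` module. The table names, columns and rows below are hypothetical; the point is that each student's name is stored once and is linked to any number of enrolment rows through `student_id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrolments (student_id INTEGER, course TEXT);
    INSERT INTO students VALUES (1001, 'Ana'), (1002, 'Ben');
    INSERT INTO enrolments VALUES
        (1001, 'Biology'), (1001, 'History'), (1002, 'Biology');
    """
)

# Combine the two tables by matching the key field student_id.
rows = conn.execute(
    """
    SELECT s.name, e.course
    FROM students AS s
    JOIN enrolments AS e ON e.student_id = s.student_id
    ORDER BY s.name, e.course
    """
).fetchall()

print(rows)  # [('Ana', 'Biology'), ('Ana', 'History'), ('Ben', 'Biology')]
```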

Data Redundancy


This is the storage of duplicate data. Data redundancy is often unnecessary but is sometimes useful or essential. Relational databases help reduce the unnecessary replication of data: unique keys in each database table are used to link tables of data belonging to specific records or entries in the database. Networks also reduce the need for duplicate data by permitting the sharing of data.

Data Integrity


The entry and preservation of stored data in a manner that results in its retrieval in a form identical to the original and representing the original observations or ideas. A condition in which rules are imposed to ensure that data is more likely to be correct. These rules may include links to related database tables, non-duplicate record keys, or detailed conditions related to the type of data being stored. It is also the assurance that unintended changes are not made to data.
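The kinds of rules mentioned above (non-duplicate record keys, conditions on the type of data, and links to related tables) can be sketched as SQLite constraints using Python's `sqlite3` module. The schema, the age range and the helper `try_insert` are assumptions made for this example, and other database systems express the same ideas with slightly different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces links between tables when this setting is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,              -- non-duplicate record key
        name       TEXT NOT NULL,
        age        INTEGER CHECK (age BETWEEN 5 AND 120)  -- condition on the data
    );
    CREATE TABLE enrolments (
        student_id INTEGER NOT NULL REFERENCES students(student_id),  -- link
        course     TEXT NOT NULL
    );
    INSERT INTO students VALUES (1001, 'Ana', 16);
    """
)


def try_insert(sql, params):
    """Return True if the row was accepted, False if a rule rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_key = try_insert("INSERT INTO students VALUES (?, ?, ?)", (1001, "Ben", 17))
bad_age = try_insert("INSERT INTO students VALUES (?, ?, ?)", (1002, "Cara", 300))
orphan = try_insert("INSERT INTO enrolments VALUES (?, ?)", (9999, "Biology"))
valid = try_insert("INSERT INTO enrolments VALUES (?, ?)", (1001, "Biology"))

print(duplicate_key, bad_age, orphan, valid)  # False False False True
```

Only the last insert is accepted: the others break the unique key rule, the age condition, or the link to an existing student.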

Data Mining


Data mining is the discovery and extraction of hidden predictive information from large databases. It means going through huge quantities of data and searching (mining) for valuable information. For example, in grocery shopping, data mining can be used to form gender-specific marketing campaigns. It uses statistical methods and artificial intelligence to locate trends and patterns that would otherwise be overlooked by normal database queries. It allows the user to find valuable veins of information inside masses of data. Data mining can be used by organizations to design effective sales campaigns, form targeted marketing plans, or develop or enhance a product to increase sales.

There are three main steps in the data-mining process:
1. Data is prepared for the data-mining process
2. A data-mining algorithm is used to process the data
3. The results of the data-mining process are evaluated
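The three steps above can be sketched in Python on a toy example. The shopping baskets, the pair-counting approach and the 0.6 threshold are all assumptions chosen for illustration; real data mining uses far larger data sets and more sophisticated algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw shopping baskets with inconsistent case and duplicates.
raw_baskets = [
    ["Bread", "milk", "eggs"],
    ["milk", "bread"],
    ["eggs", "Milk", "bread", "bread"],
    ["tea"],
]

# Step 1: prepare the data (normalize case, remove duplicates, fix the order).
baskets = [sorted({item.lower() for item in basket}) for basket in raw_baskets]

# Step 2: run a simple algorithm that counts how often item pairs co-occur.
pair_counts = Counter()
for basket in baskets:
    pair_counts.update(combinations(basket, 2))

# Step 3: evaluate the results, keeping pairs found in at least 60% of baskets.
min_support = 0.6
frequent_pairs = {
    pair: count / len(baskets)
    for pair, count in pair_counts.items()
    if count / len(baskets) >= min_support
}

print(frequent_pairs)  # {('bread', 'milk'): 0.75}
```

Here the only pattern that survives evaluation is that bread and milk are bought together in three of the four baskets.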

Data Matching


Data matching is the large-scale comparison of records or files collected for different purposes in order to produce new information. It can be conducted for several purposes, such as detecting errors and illegal behavior, locating individuals, or ascertaining a person's eligibility.
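A minimal Python sketch of data matching follows. The two data sets, the shared `national_id` field and all values are invented for the example: records collected for tax purposes are compared with records collected for benefit claims, and the combined record is new information that neither source held on its own.

```python
# Two hypothetical data sets collected for different purposes.
tax_records = [
    {"national_id": "A1", "name": "Ana Lee", "declared_income": 0},
    {"national_id": "B2", "name": "Ben Cho", "declared_income": 42000},
]
benefit_claims = [
    {"national_id": "A1", "weekly_benefit": 150},
    {"national_id": "C3", "weekly_benefit": 90},
]

# Index one data set by the shared identifier for fast lookup.
claims_by_id = {claim["national_id"]: claim for claim in benefit_claims}

# Match: keep people who appear in both sets and combine their details.
matches = [
    {**tax, **claims_by_id[tax["national_id"]]}
    for tax in tax_records
    if tax["national_id"] in claims_by_id
]

print(matches)
```

Only the person with identifier A1 appears in both sources, so only that person produces a matched record.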

Benefits:
  • Track down criminals using an information database, for example the National Crime Information Center managed by the FBI
  • Establish reputations
  • Financial trustworthiness

Issues:
  • Data errors are common
  • Data can become nearly immortal
  • Data isn't secure

Concerns of Data-Matching & Data-Mining

  • Privacy: such practices can reveal large quantities of previously unknown personal information about individuals
  • Can occur without the knowledge or consent of the data subject
  • Accuracy of the data derived from a data-matching or data-mining process
  • Information can be incorrect or incomplete at the time of collection, or cease to be accurate after some time
  • Difficult to inform the data subject of the exact purpose for which his or her personal information is to be collected or used
    • This is due to its sole purpose: to discover previously unknown information
  • Storage of large amounts of personal information