OCA Oracle Database 11g SQL Fundamentals I by Steve Ries Download PDF Free
OCA Oracle Database 11g SQL Fundamentals I by Steve Ries Download PDF Free
SQL and Relational
Databases
We live in a data-driven world. Think for a moment about all the data that exists
about you, in computers around the world.
• Your name
• Birth date and information
• Your hobbies
• Purchases you've made
• The identity of your friends
• Your place of employment
The examples are endless. Next, multiply that amount of data by the number
of people in the world. The result is a truly staggering amount of information.
How is it possible that all this data can be organized and retrieved? In today's
data-centric world, it is databases that make this possible. These Relational
Database Management Systems (RDBMS) are primarily controlled by a
programming language called Structured Query Language (SQL).
In this chapter, we will cover the following topics:
• Discussing the purpose of relational database management systems
• Understanding the use of the relational paradigm
• Examining the use of Entity Relationship Diagrams (ERDs)
• Looking at the structure of tables
• Introducing Structured Query Language (SQL)
• Reviewing commonly-used query tools
• Introducing the SQL Developer tool
SQL and Relational Databases
Relational Database Management
Systems
Imagine, for a moment, that you have the telephone books for the 20 largest cities
in the U.S. I give you the following request: Please find all the phone numbers for
individuals named Rick Clark in the Greater Chicago area. In order to satisfy the request,
you simply do the following:
• Open the Chicago phone book
• Scan to the "C" section of names
• Find all individuals that match "Clark, Rick"
• Report back their phone numbers
Now imagine that I take each phone book, tear out all of the pages, and throw them
into the air. I then proceed to shuffle the thousands of pages on the ground into
a completely disorganized mess. Now I repeat the same request: Please find all the
phone numbers for individuals named Rick Clark in the Greater Chicago area. How do you
think you would do that? It would be nearly impossible. The data is all there, but
it's completely disorganized. Finding the "Rick Clarks" of Chicago would involve
individually examining each page to see if it satisfied the request—a very frustrating
undertaking, to say the least.
This example underscores the importance of a database, or more accurately, a
Relational Database Management System(RDBMS) Today's RDBMSs are what
enable the storage, modification, and retrieval of massive amounts of data.
Flat file databases
When the devices that we know as computers first came into existence, they were
primarily used for one thing—computation. Computers became useful entities
because they were able to do numeric computation on an unprecedented scale. For
example, one of the first computers, ENIAC, was designed (although not used) for
the US Army to calculate artillery trajectories, a task made simpler through the use
of complex sequences of mathematical calculations. As such, originally, computers
were primarily a tool for mathematical and scientific research. Eventually, the use of
computers began to penetrate the business market, where the company's data itself
became just as important as computational speed. As the importance of this data
grew, so the need for data storage and management grew as well, and the concept
of a database was born.
The earliest databases were simple to envision. Most were simply large files that
were similar in concept to a spreadsheet or comma-separated values (CSV) file.
Data was stored as fields. A portion of these databases might look something like
the following:
Susan, Bates, 123 State St, Somewhere, VA
Fred, Hartman, 234 Banner Rd, Anywhere, CA
Bill, Frankin, 345 Downtown Rd, Somewhere, MO
Emily, Thompson, 456 Uptown Rd, Somewhere, NY
In this example, the first field is determined by reading left to right until a delimiter,
in this case a comma, is reached. This first field refers to the first name of the
individual. Similarly, the next field is determined by reading from the first delimiter
to the next. That second field refers to the last name of the individual. It continues
in this manner until we have five fields—first name, last name, street address, city,
and state. Each individual line or record in the file refers to the information for a
distinct individual. Because this data is stored in a file, it is often referred to as a flat
file database. To retrieve a certain piece of information, programs could be written
that would scan through the records for the requested information. In this way, large
amounts of data could be stored and retrieved in an orderly, programmatic way.
Limitations of the flat file paradigm
The flat file database system served well for many years. However, as time passed
and the demands of businesses to retain more data increased, the flat file paradigm
began to show some flaws.
In our previous example, our flat file is quite limited. It contains only five fields,
representing five distinct pieces of information. If this flat file database contained the
data for a real company, five distinct pieces of information would not even begin to
suffice. A complete set of customer data might include addresses, phone numbers,
information about what was ordered, when the order was placed, when the order
was delivered, and so on. In short, as the need to retain more data increases, the
number of fields grows. As the number of fields grows, our flat file database gets
wider and wider. We should also consider the amount of data being stored. Our first
example had four distinct records; not a very realistic amount for storing customer
data. The number of records could actually number in thousands or even millions.
Eventually, it is completely plausible that we could have a single flat file that is
hundreds of fields wide and millions of records long. We could easily find that the
speed with which our original data retrieval programs can retrieve the required data
is decreasing at a rapid rate and is insufficient for our needs.
SQL and Relational Databases
As our data demands increase, we're presented with another problem. If we are
storing order information, for example, strictly under the flat file paradigm, we are
forced to store a new record each time an order is placed. Consider this example,
in which our customer purchases six different items. We store a six-digit invoice
number, customer name, and address for the customer's purchase, as follows:
487345, Susan, Bates, 123 State St, Somewhere, VA
584793, Susan, Bates, 123 State St, Somewhere, VA
998347, Susan, Bates, 123 State St, Somewhere, VA
126543, Susan, Bates, 123 State St, Somewhere, VA
487392, Susan, Bates, 123 State St, Somewhere, VA
Using this example, notice how much duplicate data we have stored. The fields
are invoice number, first name, last name, street address, city, and state, respectively.
The only different piece of information in each record is the invoice number, and
yet we have repeatedly stored the last five fields—information that is stored in
previous records. We refer to these anomalies as repeating values. Repeating values
present two problems from a processing standpoint. First, the duplicate data must
be re-read each time by our retrieval programs, creating a performance problem
for our retrieval operations. Second, those duplicate characters constitute bytes that
must be stored on disk, uselessly increasing our storage requirements. It is clear that
the flat file paradigm needs to be revised in order to meet the growing demands of
our database.
Normalization
The world of databases changed in the early 1970s due in large part to the work of
Dr. Edgar "Ted" Codd. In his paper, A Relational Model of Data for Large Shared Data
Banks, Dr. Codd presented a new paradigm—the relational paradigm. The relational
paradigm seeks to resolve the issues of repeating values and unconstrained size by
implementing a process called normalization. During normalization, we organize
our data in such a way that the data and its inter-relationships can be clearly
identified. When we design a database, we begin by asking two questions—what
data do I have? And, how do the pieces of data relate to each other? In the first step,
the data is identified and organized into entities. An entity is any person, place,
or thing. An entity also has attributes, or characteristics, that pertain to it. Some
example entities are listed in the following diagram:
Structured Query Language
SQL was developed by Donald Chamberlain and Raymond Boyce in the early 1970s
as a language to retrieve data from IBM's early relational database management
systems. It was accepted as a standard by the American National Standards
Institute (ANSI) in 1986. SQL is generally referred to as a fourth-generation
language (4GL), in contrast with third-generation languages (3GLs) such as C,
Java, and Python. As a 4GL, the syntax of SQL is designed to be even closer to
human language than 3GLs, making it relatively natural to learn. Some do not
refer to SQL as a programming language at all, but rather a data sub-language.
A language for relational databases
Before we look at what SQL (pronounced either 'S-Q-L' or 'sequel') is, it is important
to define what it is not. First, SQL is not a product of Oracle or any other software
company. While most relational database products use some implementation of
SQL, none of them own it. The structure and syntax of SQL are governed by the
American National Standards Institute and the International Organization for
Standardization (ISO). It is these organizations that decide, albeit with input from
other companies such as Oracle, what comprises the accepted standard for SQL.
The current revision is SQL:2008.
Second, while the ANSI standard forms the basis for the various implementations
of SQL used in different database management systems, this does not mean that
the SQL syntax and functionality in all database products is the same; in fact, it is
often quite different. For instance, the SQL language permits the concatenation of
two column values into one; for example, the values hello and there concatenated
would be hellothere. Oracle and Microsoft SQL Server both use symbols to denote
concatenation, but they are different symbols. Oracle uses the double-pipe symbol,
'||', and SQL Server uses a plus sign, '+'. MySQL, on the other hand, uses a keyword,
CONCAT. Additionally, RDBMS software manufacturers often add functionalities to
their own SQL implementations. In Oracle version 10g, a new type of syntax was
included to join data from two or more tables that differs significantly from the ANSI
standard. Oracle does, however, still support the ANSI standard as well.
Read Book PDF More :
0 Comments