Difference between revisions of "Relational databases"

From MAPH09_2
Jump to: navigation, search
(Activity)
(Activity: Normalisation)
 
(17 intermediate revisions by 3 users not shown)
Line 8: Line 8:
 
http://documentation.openoffice.org/manuals/userguide3/0108GS3-GettingStartedWithBase.pdf
 
http://documentation.openoffice.org/manuals/userguide3/0108GS3-GettingStartedWithBase.pdf
  
This next reference refers to 'normalisation' http://www.island-data.com/downloads/papers/normalization.html. Read down to and including 'First Normal Form'.
+
This next reference refers to 'normalisation' http://www.island-data.com/downloads/papers/normalization.html. Read down to and including 'First Normal Form' to understand what normalisation is and why it is so important in relational database systems.
 
+
Please read the first few paragraphs to understand what this is and why it is so important in relational database systems.
+
  
 
The 'Entity-Relationship' model is the first thing to build when designing a database: please read this short introduction (down to 'Diagramming conventions') http://en.wikipedia.org/wiki/E-R_diagram
 
The 'Entity-Relationship' model is the first thing to build when designing a database: please read this short introduction (down to 'Diagramming conventions') http://en.wikipedia.org/wiki/E-R_diagram
Line 19: Line 17:
  
 
   
 
   
 
 
 
== Learning outcomes ==
 
== Learning outcomes ==
  
 
By the end of this session you will be able to:
 
By the end of this session you will be able to:
  
Explain the differences betwen "flat" and "relational" databases.  
+
*Explain the differences betwen "flat" and "relational" databases.  
  
Break a set of information down into distinct "entities" and justify your decisions.
+
*Break a set of information down into distinct "entities" and justify your decisions.
If you use iTunes you can look at the song information and explain how you might make a relational database from it.
+
  
Create a diagram that illustrates the relationships between the different "entities" in your data.  
+
*Create a diagram that illustrates the relationships between the different "entities" in your data.  
  
Use the diagram to construct a set of interlinked tables.
+
*Use the diagram to construct a set of interlinked tables.
  
 +
*Differentiate between primary and foreign keys.
  
To help you learn how to do this you have access to the Geekgirls.com databases tutorials dealing with: Database design from scratch [http://www.geekgirls.com/menu_databases.htm].
+
*Explain why normalistion and data preparation are important.
  
== Activity ==
 
  
1. Have a look at http://www.loc.gov/rr/print/coll/251_fen.html
 
Some of these images are referred to in the letters. Build a simple E-R diagram of your Fenton Letters database then build and populate another table in Base to hold images and a description of them, so that one letter can link to more than one image. Identify the primary and foreign keys - be prepared to explain the difference between these. Use the thumbnails from the Library of Congress site.
 
  
2. Take a look at http://erps.dmu.ac.uk.
+
== Activity: Adding page scans to your Fenton database ==
  
Examine this site and develop the E-R diagrams for each of the main components (exhibits, exhibitions, judges and exhibitors). Be as comprehensive as possible regarding the attributes of any entities you identify.
+
We have page scans of all of the letters in the two letterbooks. In this activity you are going to develop further the simple flat file database of Roger Fenton's Crimean War letters you built in the previous session by adding some of these page scans.
  
Select 2 of the main entities and describe the database tables diagrammatically. Identify primary and foreign keys.
+
Opening up your Fenton Letters database in Base, build another table to contain the page scan images and link it to the Letters table.
 +
 
 +
Populate this table with some of the page scans we will provide you with on a disc.
 +
 
 +
Draw a simple entity-relationship diagram to describe your Fenton Letters database.
 +
 
 +
== Activity: Adding photographs to your Fenton database ==
 +
 
 +
Now that you have got the hang of drawing E-R diagrams and linking tables in a database we are going to try something a little more complicated to produce something rather more useful. This time you are going to add images of Fenton's photographs relating to the letters. There is a good online collection of Fenton's war pictures on the Library of Congress Website. Click on the link to "View all the images" at http://www.loc.gov/rr/print/coll/251_fen.html to see thumbnail images. Some of the people, places and events depicted in these pictures are referred to in Fenton's letters. It's possible to match more than one picture to some of Fenton's letters.  It's also be possible to match more than one letter to some of his pictures.
 +
 
 +
Opening up your Fenton Letters database in Base, build and populate another table to hold some images of Fenton's photographs and a description of them, using the thumbnails and information from the Library of Congress site, so that one letter can link to more than one image and vice versa.
 +
 +
Identify the primary and foreign keys and be prepared to explain the difference between these in your seminar presentation.
 +
 
 +
Update your E-R diagram to describe all three tables.
 +
 
 +
 
 +
==Reflections on this activity ==
 +
 
 +
What is the main difference between the two versions of the database you have just built in terms of the relationships betwwen the letters and the other files you have linked them to?  Why is this difference important?
 +
 
 +
 
 +
== Activity: Normalisation ==
 +
 
 +
Now you have had a chance to see how a simple relational database works we are going to move on to look at arguably the most important aspect of database design: normalisation.  As you have seen from the required reading, normalisation is about removing abiguity from your data. 
 +
 
 +
Go to the Exhibitions of the Royal Photographic Society Website http://erps.dmu.ac.uk and search for "Abney". You should get results in the exhibits, exhibitor and judges tabs.  How many different Abney's are there?  How do you know?  How easy would it be for a computer to deduce this?  How could the data tables be constructed to enable the search for "Abney" to return these results?
 +
 
 +
== Activity: Data preparation ==
 +
 
 +
In the previous exercise you saw how it's important to remove ambiguity in data. Computers are not good with ambiguity (although there is a branch of computational intelligence that deals with "fuzzy logic" than can cope with ambiguity). Normalisation is part of a bigger cluster of tasks known as "data preparation" that is about getting data into a form that the computer can process.  Take a look at the exhibit prices listed on this catalogue page from the 1896 exhibition http://erps.dmu.ac.uk/catalogue_single.php?h=ca&pnum=E96A003&serial=16936&etid=127321&page=p. The prices are the information listed immediately after the processes, eg. 15/- 10/6, 2 guineas, etc.  If you dont know what these are take a look at these sites http://www.historyhome.co.uk/peel/economic/currency.htm and http://www.predecimal.com/predecimaldenominations.htm. Try to devise a system for recording the sale prices of all the exhibits in the 1896 exhibition so that it can be searched.
 +
 
 +
Now use the advance search function in erps to look for exhibits priced at 10/6.  Can you deduce how this search function works?
 +
 
 +
== Reflections on this activity ==
 +
 
 +
Why is data preparation important when building a database?
 +
 
 +
== Activity: Modelling complexity ==
 +
 
 +
Examine the erps site and develop E-R diagrams for each of the main components (exhibits, exhibitions, judges and exhibitors). Try to describe the attributes of all the entities you identify as precisely and comprehensively as possible.
 +
 
 +
Select two of the main entities and describe the database tables diagrammatically.  
 +
 
 +
Identify primary and foreign keys for these tables.
  
 
== Seminar preparation ==
 
== Seminar preparation ==
  
Prepare a brief presentation (maximum 5 wiki pages or Powerpoint slides, not counting references) on what you have done and what you have learned from this session.
+
Prepare a brief presentation (maximum 5 wiki pages or Powerpoint slides, not counting references) on what you have done and what you have learned from this session. Focus in particular on the key features of relational databases and the process for creating them. You will be expected to be able to use the terms: normalisation, data preparation, primary and foreign keys, entities, attributes.
 +
 
  
 
----
 
----

Latest revision as of 09:47, 27 November 2009

Erpshome.jpg

Required reading

Geekgirls.com databases: Database design from scratch. Step by step guides to designing and building databases including tutorials and exercises [1].

The guidance notes on Base: http://documentation.openoffice.org/manuals/userguide3/0108GS3-GettingStartedWithBase.pdf

This next reference refers to 'normalisation' http://www.island-data.com/downloads/papers/normalization.html. Read down to and including 'First Normal Form' to understand what normalisation is and why it is so important in relational database systems.

The 'Entity-Relationship' model is the first thing to build when designing a database: please read this short introduction (down to 'Diagramming conventions') http://en.wikipedia.org/wiki/E-R_diagram

Introduction

So far we have been working with a small, simple data set but useful research resources are often larger and more complex than this. See for example the Correspondence of William Henry Fox Talbot at http://foxtalbot.dmu.ac.uk or Exhibitions of the Royal Photographic Society Web site at http://erps.dmu.ac.uk. Collections like these are more difficult to manage in a single table and it's much harder to sort through all the information they contain to find what you are looking for or to see connections between different bits without some additional help. In this session you will learn how to use "relational" databases to manage large/complex sets of information and you will have the opportunity to model a small relational database.


Learning outcomes

By the end of this session you will be able to:

  • Explain the differences betwen "flat" and "relational" databases.
  • Break a set of information down into distinct "entities" and justify your decisions.
  • Create a diagram that illustrates the relationships between the different "entities" in your data.
  • Use the diagram to construct a set of interlinked tables.
  • Differentiate between primary and foreign keys.
  • Explain why normalistion and data preparation are important.


Activity: Adding page scans to your Fenton database

We have page scans of all of the letters in the two letterbooks. In this activity you are going to develop further the simple flat file database of Roger Fenton's Crimean War letters you built in the previous session by adding some of these page scans.

Opening up your Fenton Letters database in Base, build another table to contain the page scan images and link it to the Letters table.

Populate this table with some of the page scans we will provide you with on a disc.

Draw a simple entity-relationship diagram to describe your Fenton Letters database.

Activity: Adding photographs to your Fenton database

Now that you have got the hang of drawing E-R diagrams and linking tables in a database we are going to try something a little more complicated to produce something rather more useful. This time you are going to add images of Fenton's photographs relating to the letters. There is a good online collection of Fenton's war pictures on the Library of Congress Website. Click on the link to "View all the images" at http://www.loc.gov/rr/print/coll/251_fen.html to see thumbnail images. Some of the people, places and events depicted in these pictures are referred to in Fenton's letters. It's possible to match more than one picture to some of Fenton's letters. It's also be possible to match more than one letter to some of his pictures.

Opening up your Fenton Letters database in Base, build and populate another table to hold some images of Fenton's photographs and a description of them, using the thumbnails and information from the Library of Congress site, so that one letter can link to more than one image and vice versa.

Identify the primary and foreign keys and be prepared to explain the difference between these in your seminar presentation.

Update your E-R diagram to describe all three tables.


Reflections on this activity

What is the main difference between the two versions of the database you have just built in terms of the relationships betwwen the letters and the other files you have linked them to? Why is this difference important?


Activity: Normalisation

Now you have had a chance to see how a simple relational database works we are going to move on to look at arguably the most important aspect of database design: normalisation. As you have seen from the required reading, normalisation is about removing abiguity from your data.

Go to the Exhibitions of the Royal Photographic Society Website http://erps.dmu.ac.uk and search for "Abney". You should get results in the exhibits, exhibitor and judges tabs. How many different Abney's are there? How do you know? How easy would it be for a computer to deduce this? How could the data tables be constructed to enable the search for "Abney" to return these results?

Activity: Data preparation

In the previous exercise you saw how it's important to remove ambiguity in data. Computers are not good with ambiguity (although there is a branch of computational intelligence that deals with "fuzzy logic" than can cope with ambiguity). Normalisation is part of a bigger cluster of tasks known as "data preparation" that is about getting data into a form that the computer can process. Take a look at the exhibit prices listed on this catalogue page from the 1896 exhibition http://erps.dmu.ac.uk/catalogue_single.php?h=ca&pnum=E96A003&serial=16936&etid=127321&page=p. The prices are the information listed immediately after the processes, eg. 15/- 10/6, 2 guineas, etc. If you dont know what these are take a look at these sites http://www.historyhome.co.uk/peel/economic/currency.htm and http://www.predecimal.com/predecimaldenominations.htm. Try to devise a system for recording the sale prices of all the exhibits in the 1896 exhibition so that it can be searched.

Now use the advance search function in erps to look for exhibits priced at 10/6. Can you deduce how this search function works?

Reflections on this activity

Why is data preparation important when building a database?

Activity: Modelling complexity

Examine the erps site and develop E-R diagrams for each of the main components (exhibits, exhibitions, judges and exhibitors). Try to describe the attributes of all the entities you identify as precisely and comprehensively as possible.

Select two of the main entities and describe the database tables diagrammatically.

Identify primary and foreign keys for these tables.

Seminar preparation

Prepare a brief presentation (maximum 5 wiki pages or Powerpoint slides, not counting references) on what you have done and what you have learned from this session. Focus in particular on the key features of relational databases and the process for creating them. You will be expected to be able to use the terms: normalisation, data preparation, primary and foreign keys, entities, attributes.



Follow this link to the Photography Resources in a Digital Age Contents page to return to the module contents.

Back to the MA: Photographic History Main Page to return to the Course contents.