ecosmak.ru

Database classification according to the way data is organized. Classification and characteristics of databases

2.1. Definitions and concepts of database theory

Database (DB, database) is a named collection of structured data related to a specific subject area.

A subject area is some part of a real-life system that functions as an independent unit.

The full subject area may represent the economy of a country or a group of allied states, but in practice for information systems highest value has a subject area of ​​the scale of an individual enterprise or corporation.

A database management system (DBMS) is a set of software and language tools necessary for creating and modifying a database, adding, modifying, deleting, searching and selecting information, presenting information on a screen and in printed form, delimiting access rights to information, and performing other base operations.

A relational database is the main type of modern databases. Consists of tables between which there can be relationships by key values.

A database table (table) is a regular structure that consists of rows of the same type (records) divided into columns (fields).

In relational database theory, a table is synonymous with a relation, in which a row is called a tuple and a column is called an attribute.

In the conceptual model of a relational database, an analogue of a table is an entity (entity), with a certain set of properties - attributes that can take on certain values ​​(a set of valid values ​​- a domain).

The key element of the table (key, regular key) is its field (simple key) or a string expression formed from the values ​​of several fields (composite key), by which you can determine the values ​​of other fields for one or more table records. In practice, to use keys, indexes are created - service information containing ordered information about key values. In relational theory and the conceptual model, the concept of "key" is applied to the attributes of a relation or entity.

The primary key is the main key element that uniquely identifies a row in a table. There may also be alternative (candidate key) and unique (unique key) keys that also serve to identify rows in the table.

In relational theory, a primary key is a minimal set of attributes that uniquely identifies a tuple in a relation.

In the conceptual model, the primary key is the minimum set of entity attributes that uniquely identifies an instance of an entity.

Communication (relation) — functional dependence between objects. In relational databases, relationships are established between tables by keys, one of which in the main (parent, parent) table is primary, the second is a foreign key - in the external (child, child) table, as a rule, is not primary and forms a relationship "one to many" (1:N). In the case of a primary foreign key, the relationship between tables is one-to-one (1:1). Link information is stored in a database.

Foreign key (foreign key) - a key element of a subordinate (external, child) table, the value of which matches the value of the primary key of the main (parent) table.

Referential integrity is a set of rules that match key values ​​in related tables.

Stored procedures are program modules that are stored in a database to perform certain operations on database information.

Triggers are stored procedures that ensure that data referential integrity conditions are met in the operations of changing primary keys (possibly cascading data), deleting records in the main table (cascading deleting in child tables), and adding records or changing data in child tables.

Object (object) - an element of the information system that has certain properties (properties) and reacts in a certain way to external events (events).

A system is a set of objects interacting with each other and with the external environment.

Database replication is the creation of database copies (replicas) that can exchange updated data or replicated forms, reports, or other objects as a result of the synchronization process.

A transaction is a change in information in the database as a result of performing one operation or a sequence of operations, which must be performed completely or not performed at all. The DBMS has special mechanisms for ensuring transactions.

SQL (Structured Query Language) is a universal language for working with databases, including the possibility of creating it, modifying the structure, selecting data by query, modifying information in the database, and other database manipulation operations.

Null is the value of a table field, indicating that there is no information in this field. Permission to allow a Null value to exist can be set for individual fields in a table.

2.2. Database classification

According to the data processing technology, databases are divided into centralized and distributed.

The centralized database is stored in the memory of one computer system. This computing system can be a mainframe - then access to it is organized using terminals - or a file server on a PC local network.

A distributed database consists of several, possibly intersecting or even duplicating each other, parts that are stored in different computers in a computer network. Work with such a database is carried out using a distributed database management system (RDBMS).

According to the method of accessing data, databases are divided into databases with local access and databases with network access.

For all modern databases, you can organize network access with a multi-user mode of operation.

Centralized databases with network access can have the following architecture:

  • file server;
  • database client-server;
  • "thin client" - application server - database server (three-tier architecture).

Rice. 1. Scheme of working with a database in a local network with a dedicated file server

File server. The architecture of database systems with network access assumes the allocation of one of the network machines as a central one (file server). An operating system (OS) for a dedicated server is installed on this computer (for example, Microsoft Windows Server 2003). It also stores a shared centralized database in the form of one or a group of files. All other computers on the network act as workstations (they can run Microsoft Windows 2000 Professional or Microsoft Windows 98). Database files in accordance with user requests are transferred to workstations, where information is processed (Fig. 1). With a high intensity of access to the same data, the performance of the information system drops. Users can also create local databases on workstations.

Rice. 2. Scheme of working with the database in the "Client-server" architecture

Client-server. In this architecture, on a dedicated server running a server operating system, special software (software) is installed - a database server, such as Microsoft® SQL Serveror Oracle. The DBMS is divided into two parts: client and server. The basis of the database server is the use of the query language (SQL). An SQL query sent by the client (workstation) to the database server generates a search and retrieval of data on the server. The extracted data is transported across the network from the server to the client (Figure 2). Thus, the amount of information transmitted over the network is reduced many times over.

The three-tier architecture functions in intranet and Internet networks. The client part ("thin client") that interacts with the user is an HTML page in a Web browser or a Windows application that interacts with Web services. All program logic is placed on the application server, which ensures the formation of database queries that are submitted to the database server for execution. The application server can be a web server or specialized program(for example, Oracle Forms Server) (Fig. 3).

Rice. 3. Scheme of working with the database in a three-level architecture

2.3. Hierarchical and network data models

In a hierarchical data model, there is one main object and the rest - subordinate - objects located at different levels of the hierarchy. Object relationships form a hierarchical tree with one root object.

A hierarchical database consists of an ordered set of multiple instances of the same type of tree. Referential integrity between ancestors and descendants is automatically maintained. Basic rule: no child can exist without its parent (Fig. 4).

Rice. 4. Diagram of the hierarchical data model

A typical representative (the most famous and widespread) is the Information Management System (IMS) from IBM. The first version appeared in 1968. Many databases of this system are still supported.

Network databases

The network approach to data organization is an extension of the hierarchical one. In hierarchical structures, a descendant entry must have exactly one parent; in a network data structure, a child can have any number of ancestors.

In the network data model, any object can be both master and slave at the same time, and can participate in the formation of any number of relationships with other objects. A network database consists of a set of records and a set of links between these records, or more precisely, a set of instances of each type from the set of record types specified in the database schema and a set of instances of each type from a given set of link types (Fig. 5).

Rice. 5. Diagram of the network data model

A typical representative is the Integrated Database Management System (IDMS) from Cullinet Software, Inc., designed for use on IBM's mainstream machines running most operating systems. The architecture of the system is based on proposals from the Data Base Task Group (DBTG) of the Committee on Programming Languages ​​of the Conference on Data Systems Languages ​​(CODASYL), the organization responsible for defining the Cobol programming language. The DBTG report was published in 1971, and later several systems appeared, among which IDMS.

2.4. Relational databases

Relational systems did not immediately become widespread. While the main theoretical results in this area were obtained back in the 70s and the first prototypes of relational DBMS appeared at the same time, for a long time it was considered impossible to achieve an effective implementation of such systems. However, the gradual accumulation of methods and algorithms for organizing relational databases and managing them led to the fact that already in the mid-80s, relational systems practically ousted early DBMS from the world market.

The relational data model is based on mathematical principles that follow directly from set theory and predicate logic. These principles were first applied to the field of data modeling in the late 1960s. by Dr. E. F. Codd, then at IBM, and first published in 1970.

The technical paper "The Relational Data Model for Large Shared Data Banks" by Dr. E. F. Codd, published in 1970, is the forerunner of modern relational database theory. Dr. Codd defined 13 rules for the relational model (called the thirteen Codd rules).

The 13 Rules of Codd

  1. A relational DBMS must be able to fully manage the database through its relational capabilities.
  2. Information Rule - All information in a relational database (including table and column names) must be defined strictly as values ​​in tables.
  3. Guaranteed Access - Any value in a relational database must be guaranteed to be available for use through a combination of table name, primary key value, and column name.
  4. Null value support - The DBMS must be able to work with null values ​​(unknown or unused values), as opposed to default values ​​and independently for any domains.
  5. Online relational catalog - the description of the database and its contents should be represented at the logical level as tables, to which queries can be applied using the database language.
  6. Comprehensive Data Management Language - At least one of the supported languages ​​must have a well-defined syntax and be comprehensive. It must support data structure description and manipulation, integrity rules, authorization, and transactions.
  7. View update rule - all views that are theoretically updateable can be updated through the system.
  8. Insert, Update, and Delete - The DBMS supports not only a query for data selection, but also insert, update, and delete.
  9. Physical data independence - application programs and special programs are not logically affected by changes physical methods data access and data storage structures.
  10. Logical data independence - application programs and special programs are not logically affected, within reason, by changes in table structures.
  11. Integrity Independence − The database language must be able to define integrity rules. They must be stored in the online directory and there must be no way around them.
  12. Distribution Independence - Application programs and special programs are not logically affected by whether the data is used the first time or reused.
  13. Continuity - the inability to circumvent the integrity rules defined through the database language by using languages low level.

The basic idea of ​​relational algebra is that since relations are sets, the means of manipulating relations can be based on traditional set-theoretic operations, supplemented by some special operations, specific to relational databases.

There are many approaches to the definition of relational algebra, which differ in the sets of operations and the way they are interpreted, but, in principle, are more or less equivalent. An extended initial version of the algebra, which was proposed by Codd, is called the Codd algebra.

In this version, the set of basic algebraic operations consists of eight operations, which are divided into two classes - set-theoretic operations and special relational operations. The set-theoretic operations include the following operations:

  • association of relations;
  • relationship intersections;
  • taking the difference of ratios;
  • taking the Cartesian product of the ratios.

Special relational operations include:

  • relationship restriction;
  • attitude projection;
  • connection of relations;
  • division of relationships.

In addition, the algebra includes the assignment operation, which allows saving the results of the calculation of algebraic expressions in the database, and the attribute renaming operation, which makes it possible to correctly form the title (scheme) of the resulting relation.

  • When performing the UNION operation of two relations with the same headers, a relation is produced that includes all tuples that are included in at least one of the relations - operands.
  • The operation of intersection (INTERSECT) of two relations with the same headings produces a relation that includes all tuples that are included in both operand relations.
  • A relation that is the difference (MINUS) of two relations with the same headings includes all tuples in the first operand relation such that none of them is in the second operand relation.
  • Performing the Cartesian product (TIMES) of two relations whose headings intersect is empty produces a relation whose tuples are produced by concatenating the tuples of the first and second operands.
  • The result of a constraint (WHERE) on a relation by some condition is a relation that includes tuples of the operand relation that satisfies that condition.
  • When performing a projection (PROJECT) of a relation onto a given subset of the set of its attributes, a relation is produced whose tuples are the corresponding subsets of the tuples of the relation-operand.
  • When joining (JOIN) two relations according to some condition, the resulting relation is formed, the tuples of which are produced by combining the tuples of the first and second relations and satisfy this condition.
  • The relational division operator (DIVIDE BY) has two operands - binary and unary relations. The resulting relation consists of unary tuples that include the values ​​of the first attribute of the tuples of the first operand such that the set of values ​​of the second attribute (with a fixed value of the first attribute) includes the set of values ​​of the second operand.
  • The rename operation (RENAME) produces a relation whose body is the same as the body of the operand, but the attribute names have been changed.
  • The assignment operator (:=) allows you to store the result of evaluating a relational expression in an existing database relation.

Codd proposed the use of relational algebra in an RDBMS to break down data into related sets. He organized his DB system around a concept based on datasets.

In the relational model, data is broken down into sets that make up a tabular structure. This table structure consists of individual data elements called fields. A single set or group of fields is known as a record.

The data model, or conceptual description of the subject area, is the most abstract level of database design.

From the point of view of the theory of relational databases, the basic principles of the relational model at the conceptual level can be formulated as follows:

  • all data is represented as an ordered structure, defined as rows and columns, and called a relation;
  • all values ​​are scalars. This means that for any row and column of any relation, there is one and only one value;
  • all operations are performed on an integer relation, and the result of their execution is also an integer relation. This principle is called closure.

When formulating the principles of the relational model, Dr. Codd chose the term "relationship" (relation) because, in his opinion, this term is unambiguous (while, for example, the term "table" has many different types - a table in the text, a spreadsheet, and etc.). A common misconception is that the relational model is so named because it defines relationships between tables. In fact, the name of this model comes from the relationships (database tables) that underlie it.

Each row containing data is called a tuple, each column of a relation is called an attribute (at the level of practical work with modern relational databases, the terms “record” and “field” are used).

The elements of the description of the relational data model at the conceptual level are entities, attributes, domains and relationships.

An entity is some separate object or event, information about which must be stored in the database, which has a certain set of properties - attributes. Entities can be both physical (real-life objects: for example, STUDENT, attributes - grade book number, surname, his faculty, specialty, group number, etc.), and abstract (for example, EXAM, attributes - discipline, date, teacher, audience, etc.). Entities are distinguished by their type and instance. A type is characterized by a name and a list of properties, while an instance is characterized by specific property values.

Entity attributes are:

  1. identifying and descriptive. Identifying attributes have a unique value for entities of a given type and are potential keys. They allow you to uniquely recognize instances of an entity. One primary key (PC) is selected from the candidate keys. As a PC, a potential key is usually chosen, which is used more often to access record instances. The PC should include the minimum number of attributes required for identification. The remaining attributes are called descriptive.
  2. Simple and compound. A simple attribute consists of one component, its value is indivisible. A compound attribute is a combination of several components, possibly belonging to different types data (for example, address). The decision whether to use a composite attribute or split it into components depends on the specific processes of its use and may be associated with high speed work with large databases.
  3. One-valued and multi-valued- can have respectively one or many values ​​for each instance of the entity.
  4. Basic and derivative. The value of the main attribute does not depend on other attributes. The value of a derived attribute is calculated from the values ​​of other attributes (for example, a person's age is calculated from their date of birth and the current date).

An attribute specification consists of its name, data type, and integrity constraints, the set of values ​​(or domain) that the attribute can take.

A domain is the set of all valid values ​​that an attribute can contain. The concept of "domain" is often confused with the concept of "data type". It is necessary to distinguish between these two concepts. A data type is a physical concept, while a domain is a logical one. For example, "integer" is the data type, and "age" is the domain.

Relationships - at a conceptual level, they are simple associations between entities. For example, the statement "Customers purchase products" indicates that there is a relationship between the entities "Customers" and "Products", and such entities are called members of this relationship.

There are several types of relationships between two entities: they are one-to-one, one-to-many, and many-to-many relationships.

Each relationship in the relational model is characterized by a name, a requirement, a type, and an extent. Distinguish between optional and mandatory links. If an entity of one type is necessarily related to an entity of another type, then there is a mandatory relationship between these types of objects (denoted by a double line). Otherwise, the connection is optional.

The degree of relationship is determined by the number of entities that are covered by this relationship. An example of a binary relationship is the relationship between a department and the employees who work in it.

The Entity-Relationship diagrams, or E/R diagram, is used to describe the base schema at the conceptual design level. The method was proposed in 1976 by Peter Pin Shan Chen. Entity-relationship diagrams show entities as rectangles, attributes as ellipses, and relationships as diamonds (Figure 6).

Rice. 6. Entity-relationship diagram

Subsequently, many authors developed their own versions of such models (Martin notation, IDEF1X notation, Barker notation, etc.). In addition, different software tools that implement the same notation may differ in their capabilities. In fact, all variants of entity-relationship diagrams come from the same idea - a picture is always clearer than a textual description. All such diagrams use a graphical representation of the entities of the subject area, their properties (attributes) and relationships between entities.

The design of the database schema should solve the problem of minimizing data duplication, simplifying and accelerating the procedures for processing and updating them. An incorrectly designed database schema can lead to data modification anomalies. To solve such problems, normalization of relations is carried out.

However, in the technology of working with data warehouses, the reverse technique can be used - denormalization of relations in order to increase the speed of executing queries on very large volumes of archived data.

Within the framework of the relational data model, E. F. Codd developed the principles of normalization of relations and proposed a mechanism that allows any relation to be converted to the third normal form.

Normalization is a formal method for analyzing relationships based on their primary key and existing relationships. Its task is to replace one database schema (or set of relationships) with another schema in which the relationships have a simpler and more regular structure.

When working with a relational model, to create relations of acceptable quality, it is enough to fulfill the requirements of the first normal form.

First normal form (1NF) is concerned with the concepts of simple and complex attributes. A simple attribute is an attribute whose values ​​are atomic (i.e., indivisible). A complex attribute can have a value that is the union of multiple values ​​from the same or different domains. In the first normal form, duplicate attributes or groups of attributes are eliminated, i.e., implicit entities “disguised” as attributes are identified.

A relation is reduced to 1NF if all of its attributes are simple, i.e. the attribute value must not be a set or a repeating group.

To bring tables to 1NF, it is necessary to break complex attributes into simple ones, and move multivalued attributes into separate relations.

Second Normal Form (2NF) applies to relationships with composite keys (consisting of two or more attributes) and is related to the concepts of functional dependency.

If at any time each value of attribute A corresponds to a single value of attribute B, then B is functionally dependent on A (AB). Attribute (attribute group) A is called a determinator.

In second normal form, attributes that depend on only part of the unique key are eliminated. This part of the unique key identifies a single entity.

A relation is in 2NF if it is reduced to 1NF and every non-key attribute is fully functionally dependent on the composite primary key.

Third normal form (3NF) is related to the notion of transitive dependency. Let A, B, C be attributes of some relation. Moreover, A B and B C, but there is no inverse correspondence, i.e., C does not depend on B or B does not depend on A. Then we say that C transitively depends on A (A C).

Third normal form eliminates attributes that depend on attributes that are not part of the unique key. These attributes are the basis of a single entity.

A relation is in 3NF if it is in 2NF and has no attributes that are not in the primary key and are transitively dependent on the primary key.

There are also Boyce-Codd Normal Form (BCNF), 4NF and 5NF. However, 1NF is of the greatest importance, since subsequent NFs deal with the concepts of composite keys and complex key dependencies, and in practice there are usually simpler cases.

Modeling the database structure using the normalization algorithm has serious disadvantages:

  1. The normalization technique involves the initial placement of all attributes of the projected subject area in one relation, which is a very unnatural operation. Intuitively, the developer designs several relationships at once according to the discovered entities. Even if you commit violence against yourself and create one or more relationships, including all the alleged attributes in them, then the meaning of the resulting relationship is completely unclear.
  2. It is not possible to immediately determine the full list of attributes. Users have a habit of calling different names the same things or, on the contrary, call different things by the same names.
  3. To carry out the normalization procedure, it is necessary to isolate the dependencies of the attributes, which is also very difficult.

In the real design of the database structure, another method is used - the so-called semantic modeling. Semantic modeling is the modeling of data structures based on the meaning of these data. As a semantic modeling tool, various versions of entity-relationship diagrams (ERD) are used with the construction of a conceptual database model.

Any professional who has general principles optimal organization of relational databases, able to build a model that does not contradict the principles of normalization.

A relational database at the physical level consists of tables, between which there can be relationships by key values. Along with tables and information about relationships in a relational database, there may be "stored procedures" and, in particular, "triggers" that ensure compliance with the conditions of referential integrity of the database.

Compliance with referential integrity conditions in a relational database

The rule of matching foreign keys with primary ones is the main rule for observing the conditions of referential integrity. For each foreign key value, there must be a corresponding primary key value in the parent table.

Referential integrity can be broken as a result of insert (add), update, and delete operations in tables. There are two tables involved in referential integrity, parent and child, and these operations are possible for each of them, so there are six different options that may or may not lead to a violation of referential integrity.

For parent table:

  • Insert. There is a new primary key value. The existence of records in the parent table that are not referenced from the child table is allowed, the operation does not violate referential integrity.
  • Update. Changing the value of the primary key in a record can lead to a violation of referential integrity.
  • Removal. Deleting a record deletes the value of the primary key. If there are records in the child table that refer to the key of the record being deleted, then the foreign key values ​​will become invalid. The operation can lead to a violation of referential integrity.

For child table:

  • Insert. You cannot insert a record into a child table if new entry foreign key value is invalid. The operation can lead to a violation of referential integrity.
  • Update. When you update a record in a child table, you can try to incorrectly change the foreign key value. The operation can lead to a violation of referential integrity.
  • Removal. Deleting a record in a child table does not break referential integrity.

Thus, referential integrity can in principle be violated by one of four operations:

  1. Update records in the parent table.
  2. Deleting records in the parent table.
  3. Inserting records in a child table.
  4. Update records in child table.

Basic Strategies for Maintaining Referential Integrity

There are two main strategies for maintaining referential integrity.

RESTRICT (RESTRICT) - do not allow the execution of an operation that leads to a violation of referential integrity.

CASCADE (CASCADE CHANGE) - allow the required operation to be performed, but make the necessary changes in the related tables so as to prevent violation of referential integrity and preserve all existing relationships. The change starts in the parent table and cascades through the child tables. There is one subtlety in the implementation of this strategy, which is that the child tables themselves can be parent to some third tables. This may additionally require the execution of some strategy for this connection, etc. If, in this case, any of the cascading operations (of any level) cannot be performed, then it is necessary to abandon the initial operation and return the database to its original state . This is a complex strategy, but it does not break the relationships between parent and child tables.

These strategies are standard and are present in all DBMSs that support referential integrity.

Additional Strategies for Maintaining Referential Integrity

IGNORE (IGNORE) - allow the operation to be performed without checking the referential integrity. In this case, incorrect foreign key values ​​may appear in the child table, and the entire responsibility for the integrity of the database lies with the programmer or user.

SET NULL - Allow the required operation to be performed, but change any resulting incorrect foreign key values ​​to null values. This strategy has two drawbacks. First, it requires permission to use null values. Second, the records of the child table lose their relationship with the records of the parent table. It is no longer possible to determine with which record of the parent table the changed records of the child table were associated after the operation.

SET DEFAULT (SET THE DEFAULT VALUE) - allow the required operation to be performed, but change any incorrect foreign key values ​​that occur to some default value. The advantage of this strategy over the previous one is that it avoids using null values. It is also impossible to determine with which records of the parent table the changed records of the child table were associated after performing such an operation.

On fig. Figure 7 shows an example of a relational database containing information from the personnel department on employees of an enterprise, in which for each table a list of its fields is shown and relationships between tables are shown by a simple key - the value of the tabn field.

Rice. 7. Relational database schema

Since the 1980s, along with the widespread use of personal computers, the so-called "desktop" relational DBMS (Desktop Databases), such as dBase, FoxBase (its later versions - FoxPro and Visual FoxPro), Paradox, Access . The most common format for tables of such relational databases was *.dbf, with which dBase, FoxBase, and also Clipper worked - a system for writing programs (in string compiler mode) for working with databases. Subsequently, some of them became full-fledged network DBMSs that work not only on various operating systems in the file-server architecture, but also have the ability to work with database servers in the client-server architecture, as well as develop and use html- database pages.

All DBMS for PC can be divided into three types:

  1. Database management systems in the literal sense of the term, for which work with databases is possible only after the launch of this system without the possibility of creating stand-alone programs that work with databases. These systems include: Access, Paradox, dBase.
  2. Systems that have both tools for working with databases and the ability to develop user programs (applications) executable in the operating system, i.e. software developer tools - FoxPro.
  3. Systems for developing user programs for working with databases - Clipper, Clarion.

All such DBMS include tools for:

  • creating databases and modifying their structure; creating index files;
  • work with databases in a tabular format or in the form of a standard form with the arrangement of fields line by line; it is possible to edit data, add records, delete records, work with data from several database tables, calculate complex expressions for given conditions, etc.;
  • development of screen forms that, in addition to editable fields associated with a database or memory variables, also have controls different kind in the form of buttons; more complex objects such as drop-down lists, etc.;
  • generation of printed forms - reports of a complex structure with data grouping, with the receipt of calculated values ​​and totals by groups and totals (sum, number, average, maximum, minimum, etc.);
  • development of software modules for complex data processing;
  • generating queries of a very complex structure - using data from various databases, setting complex conditions for selecting data, sorting and grouping data.

In developer-oriented systems, it is additionally possible to develop a menu, a help system, and a project that includes all of the above components and is compiled into an executable program.

Important factors determining the choice of a DBMS are:

  • A database format that allows information to be exchanged with other operating system applications. One of the most common formats is the dbf format, which dBase, FoxBase, FoxPro, Visual FoxPro, Clipper work with. It is "understood" by all MS Office applications. Data from these databases can be transferred to Word, Excel, Access. Clarion, Paradox, Access have their own data formats.
  • Ensuring the secrecy and confidentiality of data have systems that are not focused on the developer of programs: Access, Paradox. However, this factor can be implemented when storing data on a dedicated server, where the rights of different users can be easily distinguished.

All modern DBMS support operating modes in the local network of many users with one database. Some have "wizards", "builders" and "expression generators" for accelerated development of databases, screen forms, reports, standard applications.

The latest versions of the DBMS designed to work in OC Windows 95 belong to the class of RAD-systems (Rapid Application Development) - tools for rapid application development - and have an object-oriented programming language. These are systems such as Visual FoxPro, MS Access, Visual dBase, etc.

Post-relational databases

At present, so-called postrelational DBMS are also known, which are based on a data model in the form of multidimensional tables (for example, in the Cache system of InterSystems Corporation) and the widespread use of the principles of an object-oriented approach in organizing databases and programming.

Database servers

Servers are widely used in local and global computer networks: computers and software tools for servicing clients - workstations and / or other servers.

Examples of servers might be:

  • a file server that maintains a shared file storage for all workstations;
  • an Internet server that provides information on the global Internet;
  • a mail server that provides work with e-mail;
  • database server - a DBMS that receives requests over a local network and returns information corresponding to the request.

The term "database server" is usually used to refer to the entire DBMS based on the "client-server" architecture, including both the server and client parts. The most common servers are currently Microsoft SQL Server, Oracle, IBM DB2 Universal DataBase, Informix, etc. The size of one database on these servers can reach one million terabytes.

2.5. Distributed databases

The main task of distributed database management systems is to provide a means of integrating local databases located in some nodes of a computer network so that a user working in any node of the network has access to all these databases as a single database.

Homogeneous and heterogeneous distributed databases are possible. In the homogeneous case, each local database is managed by the same DBMS. In a heterogeneous system, local databases may even belong to different data models. Network integration of heterogeneous databases is very difficult problem. Many solutions are known at the theoretical level, but so far it has not been possible to cope with the main problem: the insufficient efficiency of integrated systems. An intermediate task is solved more successfully - the integration of heterogeneous SQL-oriented systems. This is largely facilitated by the standardization of the SQL language.

An example of a distributed DBMS is System R*. In this system, application developers and end users remain in the SQL language environment. The ability to use SQL relies on System R* being transparent about the location of the data. The system automatically detects the current location of the data objects mentioned in the user's request; the same application program, including SQL statements, can be executed in different network nodes. At the same time, in each network node, at the query compilation stage, the most optimal query execution plan is selected in accordance with the location of the data in the distributed system.

Reader

Job title annotation

Workshops

Workshop name annotation

Presentations

Title of the presentation annotation
Presentations for topic 2

Speaking in more detail about the classification of databases by the nature of the information stored, we will mention factual and documentary.

In systems of a factographic type, the database stores information about the objects of the subject area that are of interest to the user in the form of “facts” (for example, biographical data about employees, data on the output of products by manufacturers, etc.). In response to a user request, the required information about the object (objects) of interest to him or a message that the required information is not in the database is issued.

In documentary databases, a storage unit is a document (for example, the text of a law or an article), and in response to his request, the user is given either a link to the document or the document itself, in which he can find the information he is interested in.

Documentary-type databases can be organized in different ways: without storage and with storage of the original document itself on machine media. The systems of the first type include bibliographic and abstract databases, as well as database indexes that refer to the source of information. Systems that provide for the storage of the full text of a document are called full-text.

In document-type systems, the search target can be not only some information stored in documents, but also the documents themselves. So, queries like “how many documents were created in a certain period of time”, etc. are possible. Often, the search criteria includes “the date of acceptance of the document”, “by whom” and other “output data” of documents as features.

Database classification by data storage method

Speaking in more detail about databases classified according to the nature of information storage, we will highlight that centralized and distributed databases imply the possibility of simultaneous access of several users to the same information (multi-user, parallel access mode). This introduces specific problems in their design and during the operation of the database.

Figure 4 - Example of a centralized database

Distributed databases also have characteristics related to the fact that physically different parts of the database can be located on different computers, but logically, from the user's point of view, they must be a single whole.


Figure 5 - Example of distributed databases

Database software is called a database management system (DBMS).

The concept of “DBMS”

A database management system is a set of language and software tools that accesses data, allows them to be created, modified and deleted, ensures data security, etc. In general, a DBMS is a system that allows you to create databases and manipulate information from them. And it performs this access to the DBMS data through a special language - SQL.

SQL is a structured query language whose main task is to provide easy way reading and writing information to the database.

So, the simplest circuit working with the database looks like this:


Figure 6 - Scheme of working with the database

Classification by data model:

  • 1. Hierarchical- this is a data model that uses a database representation in the form of a tree (hierarchical) structure consisting of objects (data) various levels. There are links between objects, each object can include several objects of a lower level. Such objects are in relation to an ancestor (an object closer to the root) to a descendant (an object of a lower level), while it is possible that the ancestor object has no children or has several of them, while the child object must have only one ancestor. Objects that have a common ancestor are called twins (in programming, in relation to the data structure, a tree is called brothers);
  • 2. Object and object-oriented- are a database management system in which information is presented in the form of objects, used in object-oriented programming. Object databases are different from relational databases, which are table-oriented. Object relational databases are a hybrid of both approaches. Object databases were considered in the early 1980s;
  • 3. Object-relational DBMS (ORDBMS)-- a relational DBMS (RDBMS) that supports some technologies that implement an object-oriented approach: objects, classes and inheritance are implemented in the database structure and query language.

Object-relational DBMS are, for example, the well-known Oracle Database, Informix, DB2, PostgreSQL;

  • 4. Relational data model (RMD)-- a logical data model, an applied theory of database construction, which is an application to the problems of data processing of such sections of mathematics as set theory and first-order logic. Relational databases are built on the relational data model;
  • 5. network data model-- a logical data model, which is an extension of the hierarchical approach, a rigorous mathematical theory that describes the structural aspect, the aspect of integrity and the aspect of data processing in network databases;
  • 6. functional data model uses this approach to define an object. Instead of representing an object as a record with a specific content, or as a tuple in a B-tree, the functional model tells what functions (or operations) are defined on that object. The representation of an object is an implementation matter and is defined at a lower level of abstraction.

Classification by Persistent Storage Environment:

  • 1. In secondary memory, or conventional (eng. conventional database) - persistent storage medium is a peripheral non-volatile memory (secondary memory), usually a hard disk.
  • 2. The DBMS places only the cache and data for current processing in the RAM;
  • 3. In random access memory(English in-memory database, memory-resident database, main memory database) - all data at the execution stage are in RAM;
  • 4. in tertiary memory(English tertiary database) - a persistent storage medium is a mass storage device (tertiary memory) detached from the server, usually based on magnetic tapes or optical disks. The server's secondary memory stores only the tertiary memory data directory, file cache, and data for current processing; loading the data itself requires a special procedure.

Content classification:

  • 1. Geographic;
  • 2. Historical;
  • 3. Scientific;
  • 4. Multimedia;
  • 5. Client.

Classification according to the degree of distribution:

  • 1. Centralized, or concentrated (English centralized database): a database that is fully supported on one computer.
  • 2. Distributed (eng. distributed database): a database, the components of which are located in various nodes of a computer network in accordance with some criterion.
  • 3. Heterogeneous distributed database: fragments of a distributed database in different network nodes are supported by means of more than one DBMS
  • 4. Homogeneous distributed database: fragments of a distributed database in different network nodes are supported by means of the same DBMS.
  • 5. Fragmented, or partitioned (English partitioned database): the method of data distribution is fragmentation (partitioning, partitioning), vertical or horizontal.
  • 6. Replicated database: The data distribution method is replication.

Classification by database processing technology

  • 1. Centralized base data- stored in the memory of one computer system. If this computing system is a component of a computer network, distributed access to such a database is possible. This way of using databases is often used in PC local area networks;
  • 2. Distributed database- consists of several, possibly intersecting or even duplicating each other, parts stored in different computers of a computer network. Work with such a database is carried out using a distributed database management system (RDBMS).

Classification by way of accessing database data:

  • 1. Databases with local access;
  • 2. databases with remote (network) access - centralized database systems with network access suggest different architectures of such systems:
    • * file server;
    • * client-server.

File server. The architecture of database systems with network access assumes the allocation of one of the network machines as the central one (server, files). A shared centralized database is stored on such a machine. All other machines on the network act as workstations that support user system access to a centralized database. The database files, in accordance with user requests, are transferred to workstations, where they are mainly processed. With a high intensity of access to the same data, the performance of the information system drops. Users can also create local databases on workstations, which they use exclusively.

Client-server. This concept implies that in addition to storing the centralized database, the central machine (database server) should be able to perform the bulk of the data processing. A request for data issued by the client (workstation) generates a search and retrieval of data on the server. The retrieved data (but not the files) is transported across the network from the server to the client. The specificity of the client-server architecture is the use of the SOL query language.

Classification by subject areas of use:

1. Documentary and documentographic databases - contain descriptions of documents. Depending on the content of the description, there are documentographic databases of types BO (only bibliographic description of the document), BC (bibliographic description and keywords) and BKR (bibliographic description, keywords, abstract or abstract). Full-text databases have also appeared that contain the full texts of documents that are actually documentary.

Documentographic systems, as a rule, are built according to a two-loop scheme: the first loop contains a documentographic database and is used for automated document search, the second loop, if necessary, provides the issuance of the full text of the document in the form of a copy of the original source on paper, microfilm or displays the text on the screen from an optical disk ( in some cases from a high-capacity hard disk).

This class includes the following types of databases:

  • - according to published scientific and technical documents;
  • - according to published documents in the field of social sciences;
  • - according to patent documents;
  • - according to reports on R&D, R&D, software;
  • - based on materials of intersectoral exchange (scientific and technical achievements, FPTO, IL, catalog, exhibition information, etc.);
  • - according to standards and other regulatory and technical documentation;
  • - bibliographic databases created in libraries, book publishing and bookselling organizations;
  • - abstract and full-text databases on socio-political information created by the mass media;
  • - database of legislative and legal information;
  • - documentographic documentographic software special types documents;
  • - Database of archival documents. An analysis of the state of affairs with DBD of various types in the country shows that at present, DBD of a documentographic type has received the greatest development, mainly according to published documents in the field of scientific and technical information, social sciences, patent documents, reports on R&D, R&D and other types of documents created by within the framework of the system of scientific and technical information.
  • 2. Product database system - Product information is the main type of technical and economic information. Product data is characterized by:
    • - a huge and often disordered nomenclature (from 10 million to 1 billion items), the OKP alone contains 25 million positions, the drawing economy system has up to 16 million fasteners, there is information on more than 6 million chemical compounds and etc.;
    • - heterogeneity and versatility of application and product description at all stages of its life cycle;
    • - an abundance of properties (features) that characterize individual product groups (up to 200 features);
    • - a variety of areas of application of products, covering all areas of economic activity;
    • - numerous and diverse links between products and other types of information (communication with developers and manufacturers, components, raw materials and resources, technological processes, works and services, condition environment and so on.);
    • - a large number and variety of categories of user groups;
    • - the presence of a large number of different and unrelated systems for classifying and coding products (OKP, ESKD, ETNVT, a harmonized system for describing and coding goods, a bar coding system, industry and local systems, etc.);

The number of users of the product database system reaches tens of thousands (enterprises of industry and Agriculture more than 100 thousand, managing and executive bodies - more than 50 thousand, cooperatives, rental and individual enterprises, etc.).

3. Economic and market information - the creation of databases and banks of economic and market information is an important factor in the functioning of society in the transition to a regulated market economy.

In previous years, economic databases were created and operated as part of the ASPR of the USSR State Planning Committee, the ESIS of the Goskomstat of Russia, the ASFR of the Ministry of Finance of Russia, the ACS of the USSR Gossnab, the OASU of the State Bank, other ministries and departments, territorial management bodies of enterprises and organizations.

Development of information support for regulated market economy will be based on 2 main factors:

  • - reduction in the volume of reporting data submitted by enterprises and organizations to management bodies and state statistics.
  • - a significant increase in the information needs of the Federation Council, enterprises, organizations, the population, bodies of territorial and intersectoral management in socio-economic information.

The main direction in the development of a database of socio-economic and market information is the creation of the following integrated databases:

1) registers and database of accounting and statistical units:

a. passports of socio-economic development of administrative-territorial units (subjects of the federation, self-government authorities)

b. structural economic units in all sectors of the economy, regardless of the type of ownership, in particular, registers of industrial, agricultural, construction enterprises, scientific and design organizations, farms, etc.;

c. building registers.

  • 2) integrated databases for a comprehensive analysis of the state and development of economic sectors.
  • 3) Database of annual balance sheets of activities, enterprises, organizations and their associations, territories and regions.
  • 4) Database of mass censuses, one-time counts and sample surveys.
  • 5) Database on bank transfers.
  • 6) DB on incomes and expenditures of the population, including DB on family budgets.
  • 4. Factual Social Data Bases - social data refers to data about the population and the social environment. Information about the population includes gender, social, medical and any other personal data about individuals, as well as summary data about the population of the country as a whole and its individual territories, and about certain groups of the population: pensioners, tenants, school-age children, women, etc.

Information about the social environment includes data on vacancies, urban planning, urban economy, urban passenger transport, laws, violations public order and etc.

The vast majority of social databases are generated within local government systems. The sources of such data are, as a rule, formalized documents.

The consumers of social databases are the population, government bodies of various levels and spheres, from housing maintenance offices and departments to the Federal Assembly, as well as scientists.

5. Databases of the country's transport systems - the point of designing a database network is to assess its current state.

As part of certain types transport and their automated control systems have been created and are functioning as separate databases and data banks, as well as powerful information systems.

In air transport, there is a Sirena computer network that automates the sale of air tickets. The network is developing and should cover the entire territory of the country.

The Express system has been created in railway transport. Its functions include not only the reservation and sale of railway tickets, but also the issuance of certificates of availability of seats.

On maritime transport within the framework of information retrieval systems "Shipping companies", "Ports", "Ship-repairing plants". "Ships" operate databases for planning and regulating the operation of the fleet and ports, databases on technical condition fleet.

In road transport, within the framework of information retrieval systems, both documentographic databases ASNTI, "Inventions", "Norms", "Standards", "Directives", and factographic databases - "Garo", "Motor transport", "Illiquid assets", "Economy ", "Construction".

  • 6. Reference bases for the population and organizations - in developed countries there is a practice of using the database for information and reference services for the population and institutions. To do this, access to the database of air carriers is organized in order to issue certificates, for example, about the schedule of planes and trains; about addresses and telephones of citizens and organizations; about radio and television programs; about holding exhibitions, etc. In addition, special information and reference databases are created, among which the following types can be distinguished:
    • - encyclopedias and reference books;
    • - indexes of firms, enterprises and organizations;
    • - biographical data ("Who is who");
    • - descriptions of new types of consumer goods;
    • - indexes of government contracts, subsidies, etc.

In our country, this type of database is currently being developed by large reference services that provide similar services, mainly using manual or automated filing cabinets. Some types of reference BND are missing.

7. Resource database system - problems natural resources occupy a special place in the development of any state, determining the degree of its independence and prosperity.

Full, reliable awareness in this matter in order to control, analyze and predict the state of resources is one of the priority needs in the interests of objective, scientifically based assessments of possible ways for the development of society.

The natural resources database system has a number of features, including:

  • - variety of types of database resource objects;
  • - interdependence and interchangeability of various types of resources, and therefore, the need to ensure information interconnection and comparability of information about them;
  • - the existence of a large number of both centralized and regional and departmental sources of information;
  • - various forms of information presentation in information flows (digital, textual, graphic, cartographic, etc.);
  • - a wide range of volumes and temporal parameters of information flows obtained both with the help of sensors and manual input;
  • - the impact of the state of resources and the environment of one region on others.
  • 8. Fact bases and banks of scientific data- modern stage The development of science is characterized by a transition to a qualitatively new level of research, which is determined by the widespread use of methods and means of informatics - the science of the laws and methods of accumulation, processing and transmission of information. In science, the application of methods and means of informatics should not only free the researcher from the routine work of searching for and preparing for the use of known information, but also ensure the implementation of a single line of mathematical technology for solving problems - from the formulation of mathematical models and their complete information support with all necessary data to the formation of software complexes and carrying out the solution of tasks. It is important that this line be continuous and operational, without technological gaps in data processing. In essence, this is determined by the decisive transformation of disparate information, separate data sets and individual programs into a single information and software product, as well as the comprehensive application of modern methods of manipulating such products using computer technology.
  • 9. Databases in the field of culture and art - databases for the automation of lexicography can have a significant economic effect if the LDB is integrated with automated publishing systems, which has abroad mass application. It should be borne in mind that the databases of all three types significantly intersect with each other in terms of data composition, although they have many specific properties. At the same time, many databases are quite large (up to hundreds of thousands and even millions of records), so their creation and maintenance requires significant funds and labor.
  • 10. Linguistic databases- Linguistic databases (LDB) contain data on linguistic units of various levels (from morpheme to text) and various information about these units.

LDBs have three main areas of application:

  • - ensuring the functioning of various automated systems related to text and speech processing (information, expert, training systems, systems for speech analysis, machine translation, etc.);
  • - automation of lexicographic activities for mass and special purposes, i.e. preparation of dictionaries various types(training, translation, normative, explanatory, etc.);
  • - automation of the work of researchers: linguists, language teachers and other philologists.

database storage information

TOPIC 2 CLASSIFICATION OF BND

Issues under study:

1. Database classification

2. DBMS classification

Literature:, chapter 1, chapter 2, chapter 3.

BND are complex systems, and their classification can be made both for the entire BND as a whole, and for each of its components separately. Classification for each component can be carried out according to many different features.

1. Database classification

1) According to the form of information presentation distinguish visual And audio systems, as well as systems multimedia. This classification shows the form in which information is stored in the database and issued to users: in the form of an image (character text, pictures, drawings, photographs, etc.), sound, or it is possible to use different forms of displaying information.

2) By the nature of data organization The database can be divided into unstructured(database in the form of semantic networks), partially structured(e.g. plain text databases or hypertext systems) and structured(require preliminary design and description of the database structure, only after that they can be filled with data). This attribute refers to information presented in symbolic form.

3) Structured database by type of model used data are divided into hierarchical, network, relational, mixed And multi-model. The development of data processing technologies has led to the emergence post-relational, object-relational or hybrid , object-oriented , multidimensional DB.

4) By type of information stored The database is divided into documentaries And lexicographic. Among the documentary bases, there are bibliographic, abstract And full text.

TO lexicographic The databases include various dictionaries, classifiers, headings, etc. They are usually used as a reference together with documentary or factual databases.

IN documentary The unit of storage in the database is a document ( For example, the text of the law or article). The search and issuance of documents takes place according to their content. The simplest search method is based on using descriptors keywords from the problem area that characterize the content of the document. Their collection, extracted from the request, is compared with the document descriptors ("search pattern"). In response to the user's request, either a link to the document or the document itself is issued, in which he can find the information of interest to him.

In systems factual type, the database stores information about the objects of the subject area that are of interest to the user in the form of “facts” ( For example, biographical data on employees, data on the output of products by manufacturers, etc.). In response to the user's request, the information required by him or a message is displayed that the required information is not in the database.

5) By the nature of the organization of data storage and access to them distinguish local And distributed DB.

Local database is a database designed to be used by a single user. Local databases can be created by each user independently, or they can be retrieved from a shared database.

Distributed DB suggest the possibility of simultaneous access of several users to the same information (multi-user, parallel access mode). Physically, different parts of the database can be located on different computers, but logically, from the user's point of view, they must be a single whole.

2. DBMS classification

1) By language of communication DBMS are divided into open(use universal programming languages), closed(own languages ​​of communication with users) and mixed.

2) By function DBMS are divided into informational And operating rooms. Information DBMS allow you to organize the storage of information and access to it. Operational DBMS perform complex processing For example, automatically allow you to get aggregated indicators that are not stored directly in the database, etc.

3) According to the scope of possible application distinguish universal And specialized, domain-specific DBMS(have powerful expressive means to model complex objects).

4) By "power" DBMS are divided into desktop And corporate. characteristic features desktop DBMS are relatively low requirements for technical means, focus on the end user, low cost.

Corporate DBMS provide work in a distributed environment, high performance, support for collaborative work in system design, have advanced administration tools and broader opportunities for maintaining integrity. These systems are complex, expensive, and require significant computing resources.

Table 2.1 - The most popular desktop DBMS

DBMS

Manufacturer

Visual dBase

dBase Inc.

Paradox

corel

Microsoft Access

Microsoft

Microsoft FoxPro

Microsoft

Microsoft Data Engine

Microsoft

Table 2.2 - Server DBMS

DBMS

Manufacturer

Oracle

Oracle Corp.

Microsoft SQL Server

Microsoft

Informix

Informix

Sybase

Sybase

5) By focusing on the predominant category of users DBMS can be distinguished for developers And for end users.

Developer oriented systems , must:

¾ have high-quality compilers;

¾ allow the creation of "alienable" software products;

¾ have advanced debugging tools;

¾ include project documentation tools;

¾ have the ability to create efficient complex systems.

Main requirements presented to end-user systems, are:

¾ convenience of the interface;

¾ high level of language means;

¾ availability of intelligent hint modules;

¾ increased protection against unintentional errors (“foolproof”), etc.

3. Classification of data banks

1) Terms of Service distinguish between free and paid. Paid are divided into commercial and non-profit.

Non-profit database operate on the principle of self-sufficiency and do not set themselves the goal of making a profit (scientific, library or socially significant C&D).

The main purpose of creating commercial BND is to profit from information activities.

2) By form of ownership BND are divided into state and non-state (private, group, personal).

3) By degree of accessibility distinguish between public and restricted access .

4) By subject area coverage BND can be classified in different "sections":

¾ territorial (worldwide, country, city, etc.);

¾ temporary (year, month, since the beginning of the century, etc.);

¾ departmental ;

¾ problem ( thematic ) .

5) By the nature of user interaction BND are divided into active and passive. IN passive bnd the leading role belongs to the user. IN active The system can change behavior on its own.

6) By the nature of the predominant information processing distinguish between OLTP - systems (On - Line Transaction Processing ) – online transaction processing systems(implement a large number of fairly simple queries) and OLAP - systems (On - Line Analytical Processing ) – analytical data processing systems(implement complex analytical data processing) or strategic decision support systems (DSS).

Until the mid-90s of the twentieth century. The database was understood as static databases ( OLTP ). By the mid-90s in the DB class OLTP so much chronological information has accumulated that the volume of the database has increased dramatically, and performance has begun to fall. For example, the work of the dean's office most often requires detailed data on the current academic year. At the same time, retrospective data for previous years is also stored in the database. Such data are needed much less frequently and most often in aggregated form. For example, give out the names of students who received only excellent grades for the last three semesters.

Table 2.3 - Comparison OLTP and OLAP

Characteristic

OLTP

OLAP

Dominant Operations

Data entry, search

Data analysis

The nature of requests

Lots of simple transactions

Complex transactions

Stored data

operational, detailed

Covering a large period of time, aggregated

Kind of activity

Operational, tactical, uncomplicated handling

Analytical, strategic: forecasting, modeling, analysis and identification of relationships, identification of statistical patterns

Data type

Structured

heterogeneous

Data retention period

Up to a year

Up to several decades

Data variability

are changing

Are added

Data ordering

For any field

Chronologically

The amount of information processed

Small

Very big

Processing speed

Medium

Very high

Often and in small portions

Rarely and in very large portions

Separate databases can combine all the data necessary to solve one or more applied problems, or data related to any subject area (for example, finance, students, teachers, etc.). The former are usually called applied databases , and the second - subject databases (corresponding to the objects of the organization, and not to its information applications).

Subject databases allow to provide support for any current and future applications, since the set of their data elements includes sets of data elements of application databases. As a result, subject databases provide the basis for handling informal, changing, and unknown queries and applications (applications for which data requirements cannot be predetermined). Such flexibility and adaptability makes it possible to create fairly stable information systems based on subject databases, i.e. systems where most changes can be made without having to rewrite old applications.

Introduction

Chapter 1. Database Basics

1.1.Classification of databases

1.3 Database description models

1.4. Desktop DBMS Basics

1.5.Requirements and standards for databases

Chapter 2. Working with a Microsoft Access database

2.1. Basics of the desktop database Microsoft Access

2.2. Working with a Microsoft Access database

Conclusion

List of used literature

Introduction

The flows of information circulating in the world that surrounds us are enormous. In

time they tend to increase. Therefore, in any organization

large and small, there is a problem of such an organization of management

data that would provide the most efficient operation. Some

organizations use filing cabinets for this, but most prefer

computerized methods - databases that allow you to effectively store,

structure and organize large amounts of data. And today without bases

data it is impossible to imagine the work of most financial, industrial,

trade and other organizations. If there were no databases, they would simply choke in

information avalanche.

There are many good reasons for converting existing information to a computer basis. Now the cost of storing information in computer files is cheaper than on paper. Databases allow you to store, structure information and retrieve

optimal for the user. This topic is relevant at the present time, because. The use of client/server technologies can save significant funds, and most importantly, time to obtain the necessary information, as well as simplify access and maintenance, since they are based on complex data processing and centralization of their storage. In addition, the computer allows you to store any data formats, text, drawings, handwritten data, photographs, voice recordings, etc.

In order to use such huge amounts of stored information, in addition to developing

system devices, means of data transmission, memory, means are needed

providing a human-computer dialogue that allows the user to enter

or make decisions based on stored data. To provide these features

specialized tools have been created - database management systems (DBMS).

The purpose of this work is to reveal the concept of a database and a database management system, as well as to consider the work of a desktop DBMS using a specific example.

1.1.Classification of databases

A database is an information model of a domain, a collection of interrelated data stored together with minimal redundancy such that it can be used optimally for one or more applications. Data (files) are stored in external memory and are used as input for solving problems.

DBMS is a program that implements centralized management of data stored in the database, access to them, keeping them up to date.

Database management systems can be classified according to the method of establishing relationships between data, the nature of the functions they perform, the scope of application, the number of supported data models, the nature of the language used to communicate with the database, and other parameters.

DBMS classification:

· According to the functions performed, DBMS are divided into operational and informational;

· according to the scope of application, DBMS are divided into universal and problem-oriented;

· according to the language of communication used, DBMSs are divided into closed ones, which have their own independent languages ​​for users to communicate with databases, and open ones, in which a programming language extended by data manipulation language operators is used to communicate with the database;

· According to the number of supported levels of data models, DBMS are divided into one-, two-, three-level systems;

· according to the method of establishing links between data, relational, hierarchical and network databases are distinguished;

· according to the method of organizing data storage and performing processing functions, databases are divided into centralized and distributed.

Centralized database systems with network access assume two main architectures - file-server or client-server.

File server architecture. Assumes the allocation of one of the network machines as a central one (the main file server), where a shared centralized database is stored. All other machines act as workstations. Database files in accordance with user requests are transferred to workstations, where they are mainly processed. With a high intensity of access to the same data, the performance of the information system drops.

Client-server architecture. This model of interaction between computers on a network has actually become the standard for modern DBMS. Each of the computers connected to the network and making up this architecture plays its own role: the server owns and manages the information resources of the system, the client has the opportunity to use them. In addition to storing the centralized database, the database server handles the bulk of the data processing. A request for data issued by the client (workstation) generates a search and retrieval of data on the server. The extracted data is transported across the network from the server to the client. The specificity of the client-server architecture is the use of the SQL query language.

The database server is a DBMS that simultaneously processes requests from all workstations. As a rule, the client and the server are geographically separated from each other, in which case they form a distributed data processing system.

1.2. DBMS functionality

The characteristics of a DBMS are:

performance;

Ensuring data integrity at the database level;

Ensuring data security;

Ability to work in multi-user environments;

Possibility to import and export data;

Providing access to data using the SQL language;

Possibility to make requests

Availability of tools for developing application programs.

DBMS performance is evaluated:

The time it takes to complete requests

the speed of information retrieval;

time of importing databases from other formats;

the speed of operations (such as updating, inserting, deleting);

time of report generation and other indicators.

Data security is achieved:

Encryption of application programs;

data encryption;

protection of data with a password;

· restriction of access to the database (to a table, to a dictionary, etc.).

Ensuring Data Integrity implies the existence of means to ensure that the information in the database always remains correct and complete. Data integrity must be maintained regardless of how the data is entered into memory (online, through import, or through special program). Currently used DBMS have the means to ensure data integrity and reliable security.

The database management system manages data in external memory, provides reliable data storage and support for the appropriate database languages. important function The DBMS is the function of managing RAM buffers. Usually DBMS work with databases of large sizes, often exceeding the size of the computer's RAM. Developed DBMS maintain their own set of RAM buffers with their own discipline for replacing them.

The most widely used database management systems are Microsoft Access and Oracle.

The stages of work in the DBMS are:

creation of the database structure, i.e. definition of the list of fields that make up each record of the table, types and sizes of fields (numeric, text, logical, etc.), definition of key fields to provide the necessary links between data and tables;

Entering and editing data in database tables using the standard form presented by default in the form of a table and using screen forms specially created by the user;

processing of data contained in tables based on queries and on the basis of the program;

· output of information from the computer with the use of reports and without the use of reports.

These stages of work are implemented using various commands.

The centralized database provides ease of management, improved use of field data for remote queries, higher processing concurrency, lower processing costs.

A distributed database involves the storage and execution of data management functions in several nodes and the transfer of data between these nodes in the process of executing queries. In such a database, not only its various tables can be stored on different computers, but also different fragments of one table. At the same time, it does not matter for the user how the data storage is organized, he works with such a database as with a centralized one.

1.3. Database description models

There are three types of database description models - hierarchical, network and relational, the main difference between which is the nature of the description of the relationship and interaction between objects and attributes of the database.

Hierarchical model involves the use of tree structures consisting of a certain number of levels to describe the database. A "tree" is a hierarchy of elements called nodes. Elements are understood as a list, a collection, a set of attributes, elements that describe objects.

An example of a simple hierarchical structure is the administrative structure of a higher education institution, the elements of which are: "University - Faculty - Group". At each level of the hierarchy of a given structure, different attributes can be used. For example, the attributes of the third level can be: specialization of the group, number of members, surname of the head of the group, and others.

In this model, there is a root node, or simply a root - "University", which is at the highest level of the hierarchy, and therefore has no nodes above it. Each node of the model has only one source, located in relation to it by more than high level, and at subsequent levels of classification, it may have one, two or more nodes, or not have them at all.

Hierarchy principles:

· the hierarchy always starts from the root vertex (or main node);

The source node from which the tree is built is called the root node or simply the root, and one tree can have only one root;

A node may contain one or more attributes that describe the object in it;

generated nodes can be embedded in the "tree" both in the horizontal direction and in the vertical;

· Child nodes can only be accessed through the source node, so there is only one access path to each node.

The advantage of the model is the simplicity of its construction, the ease of understanding the essence of the hierarchy principle, the availability of industrial DBMS that support this model. The disadvantage is the complexity of the operations to include information about new database objects in the hierarchy and remove obsolete information.

network model describes elementary data and relationships between them in the form of a directed network. These are such relations between objects, when each child element has more than one source element and can be associated with any other element of the structure. For example, in the management structure of an educational institution, the child element "Student" may have not one, but two initial elements: "Student - Study group”, and “Student – ​​Dorm Room”.

Network structures can be multi-level and have varying degrees of complexity. A schema that has at least one many-to-many relationship and that requires complex methods to implement is a complex schema.

The database described by the network model consists of areas, each of which consists of records, and the latter, in turn, consist of fields. The disadvantage of the network model is its complexity, the possibility of loss of data independence when the database is reorganized. As new users, new applications, and new types of queries come in, the database grows and can break the logical view of the data.

relational model is based on the concept of "relationship", and its data are formed in the form of tables. A relation is a two-dimensional table that has its own name, in which the minimum object of actions that preserves its structure is a table row (tuple) consisting of table cells - fields.

Each column of the table corresponds to only one component of this relationship. From a logical point of view, a relational database is represented by a set of two-dimensional tables of various subject content.

Depending on the content, relational database relations are objective and coherent. Objective relations store data about any one object, an instance of an entity. In them, one of the attributes uniquely identifies the object and is called the key of the relation or the primary attribute (for convenience, it is written in the first column of the table). The remaining attributes are functionally dependent on this key. Objectively, there can be no duplicate objects, and this is the main limitation of a relational database. A connected relation stores the keys of several object relations, according to which links are established between them.

If the set of database attributes is not fixed in advance, then various options for grouping them are possible, however, regardless of the chosen method, uniform requirements must be observed. In particular, if the database contains many relationships, then they should have a minimum redundancy in the representation of information; the attributes included in the database must ensure the performance of mass calculations; when new attributes are added to the database, rebuilding relationship sets should be minimal.

The advantages of the relational model include: ease of construction, accessibility of understanding, the ability to operate a database without knowing the methods and methods of its construction, data independence, structure flexibility, and others. The disadvantages of the model are: low performance compared to the hierarchical and network models, complexity software, redundancy.

1.4. Desktop DBMS

Desktop DBMS differ in that they use a computing model with a network and a file server (architecture "file server"). The increase in the complexity of tasks, the emergence of personal computers and local area networks were the prerequisites for the emergence of a new "file server" architecture. This network-accessible database architecture assumes that one of the computers on the network is designated as a dedicated server that will store the database files. In accordance with user requests, files from the file server are transferred to user workstations, where the main part of data processing is carried out. The central server basically performs only the role of a file storage, not participating in the processing of the data itself.

The work is structured as follows:

The database in the form of a set of files is located on the hard disk of a dedicated computer (file server). Exists the local network, consisting of client computers, on each of which a DBMS and an application for working with the database are installed. On each of the client computers, users have the ability to run the application. Using the user interface provided by the application, it initiates a request to the database to fetch/update information.

All calls to the database go through the DBMS, which encapsulates within itself all the information about the physical structure of the database located on the file server. The DBMS initiates calls to data located on the file server, as a result of which part of the database files is copied to the client computer and processed, which ensures the execution of user requests (the necessary operations are performed on the data). If necessary (in case of data changes), the data is sent back to the file server in order to update the database. The DBMS returns the result to the application. The application, using the user interface, displays the result of query execution. As part of the "file-server" architecture, the first versions of the popular so-called. desktop DBMS such as dBase and Microsoft Access.

The following main disadvantages of this architecture are indicated: when many users access the same data at the same time, the performance drops sharply, because it is necessary to wait until the user working with the data completes his work. Otherwise, changes made by some users may be overwritten by changes made by other users.

To date, more than two dozen data formats for desktop DBMS are known, but the most popular, based on the number of copies sold, are dBase, Paradox, FoxPro, and Access. Of the recently appeared DBMS, it should also be noted Microsoft Data Engine - essentially a server DBMS, which is<облегченную>version of Microsoft SQL Server, but intended primarily for desktop and small workgroup use.

Manufacturer

http://www.dbase2000.com/

http://www.corel.com/

Microsoft Access 2000

http://www.microsoft.com/

Microsoft FoxPro

http://www.microsoft.com/

Microsoft Visual FoxPro

http://www.microsoft.com/

Microsoft Visual FoxPro

http://www.microsoft.com/

Microsoft Data Engine

http://www.microsoft.com/

1.5.Requirements and standards for databases

The following basic requirements are imposed on modern databases, and, consequently, on the DBMS on which they are built:

· High performance (short response time to a request).

To create a new database using the wizard, follow these steps:

Execute the command File [Create]

In the "Create" dialog box that opens, select the "Databases" shortcut. A list of databases offered by the wizard will appear on the screen. This list is very large and can reach several dozen different options that can be immediately used or will serve as the basis for building other databases. For example, “Work Orders”, “Invoices”, “Contacts”, “Events”, … etc.

Select from the list the sample database that suits you and launch the database creation wizard by clicking the OK button.

In the “New database file” dialog box that opens, from the Folder drop-down list, select the folder where you want to save the created database, and in the File name field, enter its name. Then click the Create button.

In the next dialog box, the wizard tells you what information the database it will create will contain. At the bottom of this dialog box are the following buttons:

Cancel - terminates the wizard;

Back - allows you to return to the previous step in the wizard;

Done - starts the database creation wizard with the selected options, and before pressing this button, the information that will be stored in the database is displayed.

The dialog box that opens contains two lists. The first one is a list of database tables and the second one is a list of fields in the selected table. This list marks the fields that will be included in the table. Usually almost all fields of tables are marked, with the exception of fields that are used quite rarely. By checking or unchecking the fields, you can select the table fields. After selecting the table fields, click Next.

In the next dialog window, select a screen design type from the proposed samples and click the Next button (in this case, the screen offers you the opportunity to preview the screen design types that you can select in the right window of the dialog window).

At the next step of the wizard, you can define the type of reports generated for the database. After selecting the view that suits you, click the Next button (here you are also invited to familiarize yourself with the possible options, which can also be iterated).

The database creation wizard dialog box that opens next allows you to specify its title and image (for example, the company's trademark) that will appear in all reports. If you choose to use a picture, check the Yes box. In this case, the Picture button becomes available, clicking on which opens the “Select picture” dialog box for selecting a file with a picture that you previously created. Click the Next button to make further settings.

Clicking the Finish button last window dialog, you start the wizard to build a database with the specified parameters. Using the Back button, you can return to any of the previous steps and change the database settings. You can click the Finish button in any dialog box of the wizard, refusing to further set additional parameters. In this case, the wizard uses the default settings in its work.

After clicking the Finish button, the wizard proceeds to create a database consisting of tables with fields you specify, simple forms for entering and viewing information, and simple reports. After completing the process of creating a database, you can immediately use the finished database: enter data into tables, view them and print them.

If the options of the proposed databases do not suit you, then you can create an empty database and add tables, queries, forms and reports to it.

So, you start creating database tables, into which information will be subsequently entered. In the future, the data in the table can be supplemented with new data, edited or excluded from the table. You can view the data in tables or arrange it according to some criteria. The information contained in the tables can be used to generate reports. In addition, you can give a graphical interpretation of the information contained in the database. You will get acquainted with the solution of these problems in the following chapters.

Creating a table in MS Access is carried out in the database window. Consider the sequence of your actions when creating a table in a new database:

Open the window of the database you created and go to the “Tables” tab.

Click the New button in the database window.

The “New Table” dialog box will open, in the right part of which there is a list of options for further work:

Table mode - allows you to create a new table in table mode;

Constructor - allows you to create a new table in the table designer;

Table Wizard - allows you to create a new table using the wizard;

Import tables - allows you to import tables from an external file into the current database;

Link to tables - allows you to create tables linked to tables from external files.

Select from this table the option for creating a table that suits you and click OK.

Create the table structure using the tool of your choice. You can create a table using the wizard and using the designer.

To associate a table with the information it contains, each table is given a name. Specify a name for the table in the Save dialog box and click OK.

When naming a table, like a database name, you don't have to be limited to eight characters. The table name, like the names of other database objects, is stored in the database itself.

The field name is entered in the input field of the field name column. When naming fields, you must follow the following rules:

The field name can contain up to 64 characters, but you should not abuse this possibility by setting names that are too long;

The field name can contain letters, numbers, spaces, and special characters, except for the dot (.), exclamation mark (!), brackets (), and some control characters (with ASCII codes 0-31);

The field name cannot start with a space;

Two fields in the same table cannot have the same name;

Non-compliance with these rules is monitored by means of the MS Access DBMS, but in some cases this can lead to hard-to-detect errors, so it is recommended that you independently monitor compliance with the above rules in practical work.

It is advisable to try to use names that are short in order to make them easier to identify when viewing tables.

Text;

Numerical;

Monetary;

Counter;

Date/Time;

Logical;

MEMO field:

OLE object field;

Substitution master.

Text fields can contain letters, numbers, and special characters. The maximum field width is 255 characters.

To change the field width, in the Field size line of the “Field properties” section, specify a number that determines the field width (from 1 to 255).

Each of the data types has its own properties, which are displayed in the Field Properties section of the designer window.

After we have looked at the data types in Access and the individual properties of the table fields, we can begin to create the structure of the table. Let's look at creating a table structure using the example of creating the Orders table in the Northwind database that comes with Access. This table is already in the Northwind database, but it will be very useful to review the process of creating this table. In order not to break the structure of the Northwind database, first create a sample database and open its window.

In the Table Design window, in the Field Name column, enter OrderCode.

Press the Tab or Enter key to move to the Data Type column. At the same time, note that information appears in the “Field Properties” section at the bottom of the dialog box.

The Data Type column now has the value Text. Click on the expand button on the right side of the rectangle and you will see a list containing all data types. From this list, use the mouse or use the up and down keys to select the Counter value and press the Tab key to move to the Description column. The Description column is the explanation you give to your fields. When you work with this table in the future, this description will appear at the bottom of the MS Access screen whenever you enter the OrderCode field and remind you of the purpose of this field.

Enter explanatory text in the Description column and press Tab or Enter to move on to entering information about the next field. Enter a description for all fields in the table in the same way.

Create a table in Datasheet view

We have considered two ways to create tables, now we are moving on to the third. Each of the methods has its own advantages and disadvantages. But we are sure that this method of creating a table will amaze you with its simplicity, clarity and you will like it very much. No wonder it is located first in the list of methods for creating tables in the New Table dialog box. You will probably use it most of the time.

Below is the sequence of actions that you have to perform:

Go to the "Tables" tab of the database window and click the Create button.

In the New Table dialog box, select Table Mode from the list of options and click OK. As a result of these actions, the “Table” dialog box will open, containing the table created by default. This table has 20 columns and 30 rows, which is enough to get you started. After saving this table, you can of course add as many rows and columns as you need.

The names of the table fields are defined by default, but they are unlikely to meet your requirements. MS Access makes it very easy to give fields new names. To do this, double-click on the selection area of ​​the first field (whose heading contains Field 1). The field name is highlighted and a blinking cursor appears. Enter the name of the first field and press the Tab key. Similarly, enter the rest of your table's field names in the following columns.

Now fill in a few lines of your table, entering the information in the form in which it will be entered in the future. Try to write everything in the same style (for example, if you wrote down the first date as 10/14/09, then don't write the next one as November 3, 2009). If MS Access sets the wrong data type, you can change it, but it's better to enter everything correctly at once.

Save the table by executing the File/Save Layout command or by clicking the Save button on the toolbar. In the Save dialog box that opens, give the table a name and click OK.

When prompted to create a primary key for the table, click Yes, and MS Access will create the table by deleting extra rows and columns.

Now make sure that Access has selected the correct data types for each field. To do this, go to the table designer window by executing the View/Table Designer command. If something does not suit you in the structure of the table, make the necessary changes.

In the previous chapter we looked at the use of filters, and in this chapter we will look at more powerful remedy data selections - queries.

In practice, it is often required to select from the source table a part of the records that meet certain criteria, and to order the selection. Criteria may be determined by a combination of a number of conditions. For example, you need to select records about suppliers of a certain product from Minsk and sort them alphabetically by company name. To solve such problems, the Query Wizard and the Query Builder are designed, with which you can:

Form complex criteria for selecting records from one or more tables;

Specify the fields displayed for the selected records;

Perform calculations using the selected data.

What is a Sample Request?

Previously, we considered the general provisions related to databases, where we noted that one of the main purposes of databases is quick search information and answers to a variety of questions. Questions formulated in relation to the database are called queries. MS Access uses the query builder and the MS Access SELECT statement to form queries.

What is a "request by pattern"? A pattern query is an interactive tool for selecting data from one or more tables. When forming a query, you need to specify the criteria for selecting records in the source table. In this case, instead of typing sentences in a special language, you should simply fill out the request form, which is located in the request designer window. The method of generating a request by filling out a form is easy to learn and understand. It contributes to the effective use of MS Access capabilities by users who have even minimal skills in working with the application or who do not have it at all.

All the necessary information is in the Clients table of the Northwind database. Therefore, to create a request, follow these steps:

In the database window, go to the Queries tab and click the Create button.

The order of the fields in the query form determines the order in which they appear in the resulting table. To change the location of a field in this list, do the following:

Place the mouse pointer on the column selection area, which is located directly above the field name. When the pointer changes to an arrow, click the button to highlight the column.

Click and hold the mouse button in this position. A rectangle will appear at the end of the pointer.

Move the column in the required direction. A thick vertical line will show its current position.

Release the button when the thick vertical line is in the desired location. The field will be moved to the new location.

Moving a query form column. Sometimes it is not immediately possible to select a column for moving it. Make sure you click on the column selection area (the little rectangle that sits right above the field name). This is the only place where you can grab a wrapping column. To transfer a field to the request form, double-click on the field name

Read Auto Report.

A ready-to-use report will appear on the screen in a matter of seconds. This report includes all fields in the table. Their names are arranged vertically in volume the same order as they appear in the table. To the right of the name of each field, its value in the table is displayed.

Conclusion

In this term paper the concept of a database and a database management system was considered, with the help of which the centralized management of data stored in the database, access to them, and keeping them up to date is implemented. So, a database is a collection of interrelated data stored together in one or more computer files. And also work with one of the popular DBMS Microsoft Access was considered.

The Microsoft Access database provides necessary funds to work with databases for an inexperienced user, allowing him to easily and easily create databases, enter information into them, process queries and generate reports. Unfortunately, the built-in help system does not clearly explain the novice user how to work, so there is a need for a manual.


Marchenko A.P. Microsoft Access: Short Course. - St. Petersburg: Peter, 2005. - p. 56

Lazarev I.P.. “Microsoft Access for Dummies”. St. Petersburg - Peter, 2004. - p.139.

Lazarev I.P.. “Microsoft Access for dummies”. St. Petersburg - Peter, 2004. - p. 196.

Marchenko A.P. Microsoft Access: A Short Course. - St. Petersburg: Peter, 2005. - 239.

Loading...