Data Engineering Interview Questions

Displaying 1 - 8 of 8

What is an E-R Model?

04/24/2021 - 17:01 by devraj

An E-R model maps real-life things as entities and their properties as attributes. Every entity in the E-R model is related to some other entity. The relation between theses entities defines how each entity is related to other entities. This forms an E-R model. An E-R model can be mapped to a relational model using foreign keys. Entities are tables and attributes becomes columns of that table. All these tables can, then, be related using foreign keys.

What is a full functional dependency?

04/24/2021 - 17:00 by devraj

If a non-prime attribute of a table depends on the complete composite key, there is a full functional dependency of the non-prime attribute on the composite key of the table. Suppose, two attributes together form the composite key of the table. If both of these attributes are required to determine the value of a non-prime attribute, the non-prime attribute has a full functional dependency on the composite key.

What is the ACID Property of a relational DBMS?

04/24/2021 - 16:59 by devraj
  1. Atomicity: This property ensures that either the transaction happens completely or doesn't happen at all. Every transaction causes some data values in a table to either delete or update. It may also add a new row to the table. It is necessary that when a transaction happens, either all data values or rows are updated or none of the values is changed.
  2. Consistency: This property ensures that the data is consistent in every table. Data before and after a transaction is made must be consistent.
  3. Isolation: If two transactions are happening at the same place, either they should be independent of each other or one of these transactions must happen first.
  4. Durability: Every transaction must be durable. This means that if a transaction occurs, then the changes made by this transaction to the database remain even in the event of a system failure.

What are the different anomalies in a database?

04/24/2021 - 16:58 by devraj

Insertion anomaly: It occurs when you insert a data record as a row but the information is not available for all the columns, causing null values in a row.

Updation anomaly: It occurs when you update information in one row of a table, causing the previous information to be deleted, as there is no other copy of that data.

Deletion anomaly: It occurs when deletion of one row to remove data from some columns causes the data in other columns to be deleted as well. This occurs when data about multiple entities is stored in the same table. If information about one entity is deleted, then the information about another entity in the same row gets deleted.

Define various keys in a database.

04/24/2021 - 16:56 by devraj

Super keys: These are combinations of all possible attributes that can uniquely identify each row of a table. 

Candidate keys: These are the super keys that contain only the necessary attributes to identify each row of a table. If a table contains numerous candidate keys, then the database designer chooses one of those keys to uniquely identify each row. This key is the primary key of that table. Also, if a primary key contains more than one attribute, then it is a composite key.

Foreign keys: These are used to implement relations between two tables. One of the tables thus related contains the foreign key, which refers to the primary key of the other table. The values present in the foreign key column of one table must be present in the primary key column of the other table.

Explain the three levels of database architecture.

04/24/2021 - 16:54 by devraj

Three levels of database architecture arre:

  1. Internal level: This level defines how the data is stored on physical devices.
  2. Logical level: This level defines how the data is stored in the database.
  3. View level: This level includes applications that are used to access the data in the database.

What is data normalisation?

04/24/2021 - 16:51 by devraj

Data normalisation refers to organising data into tables such that no value is repeated in a table. The following three anomalies arise if data is not organised properly: insertion, deletion and updation. These anomalies either lead to the deletion of the required data or the creation of null values in various fields when data is inserted, deleted or updated into the database. To handle these anomalies, data normalisation is used. Data normalisation puts only the required information in each table such that there is no repetition of record in it. For example, if customer information, such as name, phone number and address, gets repeated in the ‘order’ table each time an order is placed, then, according to data normalisation, the customer information must be stored in a different table. The order table must contain only one foreign key that refers to the primary key of the customer table.

Data normalisation involves the following three steps:

1. First normal form (1NF): If all the fields in a table contain only one value, and each row of the table can be uniquely identified by a primary key, then that table is in 1NF.
2. Second normal form (2NF): If a table is in 1NF and if all the non-prime attributes are fully functionally dependent on the primary key of the table, then that table is in 2NF.
3. Third normal form (3NF): If a table is in 2NF and if it contains no transitive dependencies, then that table is in 3NF.

What is a relational database management system?

04/24/2021 - 16:50 by devraj

A DBMS that is based on a relational model is called a relational database management system. In this system, data is stored in tables. Each table represents only one entity and the columns of a table represent the properties of that entity. Also, each row contains a single record of that entity. In a relational model, two tables are related using the foreign key values. Relational databases support transaction systems, where each transaction must either happen or not happen at all. This is because of the ACID properties of relational databases.