Users and organizations are generating new data at an unprecedented level. Many organizations are utilizing a dozen or more different and non-integrated systems across their enterprise. They are using a mixture of both on-premise and cloud platforms, storing data in a variety of formats, from a standard spreadsheet to databases to even physical paper.
As the amount of data increases exponentially, the amount of meaningful information is not following at the same rate. Data is not useful if it cannot be organized and understood by the target audience. Let us teach you the basics and beyond!
What is a Database?
This brings us to databases! Databases have been around for a long time and there are many different types of databases. The most common type of database is a relational database.
A relational database stores data in tables and columns. The rows are the records, and the columns are the fields in the record. Inside a database table, a row represents a single record, and a column represents an attribute of that record. Typically, data in a single table represents a single relation. A database in its entirety can store anything from sales transactions to personal photos. If the data can be defined, it can be stored in a relational database.
Almost all reporting projects will utilize at least one if not multiple databases in the project. Usually, you’re exporting data from one location and importing it in another. A database can be a standalone application or integrated into something else.
You will rarely work directly with a database, but rather through a mix of middleware that helps keep the database running correctly. A database can run on a standalone server or can be integrated into an application.
Almost everything, in some capacity, is using a database. In order to understand databases, you must first understand the basics of Databases: Keys, relationships, and normalization.
Keys. Are the unique identifier for a record. Keys are all around you and help you locate something specific. For example, invoice number, customer id, booking number, etc. are all keys. Keys are used to establish relationships between tables and allow you to access data from multiple tables without having to join the tables.
Relationships. Are the way we relate multiple data tables together. We most often do it through the mapping of keys. Keys are created when two or more tables share a common key field(s). The relationship is established by creating a foreign key in one table that references the primary key of another table.
Normalization is the process of organizing data into tables so that it can be easily accessed and used by the application. It is important to remember that databases are not just for large organizations or enterprises. Databases can be used by anyone who needs to store data in an easily accessible and useable format.
- Standardize your keys throughout your systems. For example, utilize a single ID to represent your clients or vendors. The ID will be the same regardless of what system you are using.
- Use Numbers. Generally, its best to use integers for your keys. Avoid using letters if possible. Numbers are usually faster for systems to process then text.
- Consistent. Avoid keys which could change. For example, email addresses or last names. These values can change from time to time and should be avoided.
- Simple. Keep your keys and IDs as simple as possible. Don’t make them any longer or complicated as needed.
- One System to Rule Them All. Pick a master system of record which will be the primary record keeper for your organization. All other systems will be ancillary to the main system and will report to and get information from this system.
Database Options – Picking the right tools for the job
Choosing the right database is enormously important as it is difficult to change later. There are a number of different database types to choose from and each has its own set of benefits and drawbacks. If unfamiliar, you should consult with a technology professional when making the decision.
Types of Databases
When to Use a Database?
Relational databases are the most common type of databases. They store data in tables that are related to one another through keys. This allows for easy access and reporting on the data.
NoSQL databases are an increasingly popular choice as they are designed to handle large scale, unstructured data. They do not require the establishment of relationships between tables and can store data in any format.
Multidimensional databases are well suited for handling data that is organized in a hierarchical manner. This type of database is often used by businesses for financial analysis and reporting.
Once you have selected a database, you will need to select the tools that you will use to access and work with the data.
There are a number of different software applications that can be used for this purpose, including Microsoft Access, Microsoft SQL Server, and Oracle. Each application has its own strengths and weaknesses. You should consult with your technology professional if you are unfamiliar with these options when making the decision about which tool will be the best for your purposes.
- MS Access. Great personal database, but it should not be used for large datasets or multiple simultaneous users.
- MySQL. Great all-around database, free, open source, pretty good functionality and speed.
- MS SQL. Great all-around database. Easy to work with if you are already familiar with other Microsoft applications.
- Oracle / Postgres. The databases of choice for large datasets that need outstanding performance, speed, and security.
In general, a database can be used to store anything from simple records to large complicated datasets. However, databases are not always the best solution for every problem. There are a few key scenarios when you should consider using a database:
- When data needs to be accessed from multiple locations or systems.
- When data needs to be normalized into tables for easy access and use by an application.
- When large amounts of diverse data needs to be stored
- You need to have several users interacting with the data at the same time.
- You have mixed-use cases on how you want to store, retrieve, and report on the data.
- You will be accessing the data using a mix of technologies.
- You have significant security requirements.
Transactional vs. Analytical Databases
A transactional database is used for real-time transactions such as customer purchases, system logs, or posts to the general ledger. It is the database that is responsible for the day-to-day operations.
An analytical database is used for data warehousing, reporting and complex analytics. The data in an analytical database is typically extracted from one or more transactional databases. Depending on the size and complexity of the data these procedures can interfere or interrupt real-time transactions.
An analytic database might contain centralized date coming from several transactions databases inside an organization.
The best practice is to usually utilize independent infrastructure for transactional and analytic databases.
Understanding ACID for Databases
Properties that guarantee that database transactions are processed reliably can be remembered by the acronym ACID, Atomicity, Consistancy, Isolation and Durability.
- Atomicity requires that each transaction is “all or nothing”, if one part of the transaction fails, the entire transaction fails, and the database state is left unchanged.
- Consistency requires ensures that any transaction will bring the database from one valid state to another.
- Isolation ensures that the concurrent execution of transactions results in a system state that would be obtained if transactions were executed serially.
- Durability ensures that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors.
Hosted Cloud Databases – Where do you want to store your data?
After you have selected your database platform, you need to figure out where you are going to store your data. This decision is largely based on the size of your data and your budget. There are a number of different storage options to choose from, including cloud-based storage and local storage. Again, each option has its own set of benefits and drawbacks. You should consult with your technology
In conclusion, Databases can be a powerful tool to help you get your data organized and ready to work for you. But, it is important to select the right type of database for your needs, choose the correct tools to access and work with the data, and carefully plan and execute any data integration between databases. By following these steps, you can be sure that your databases will help you achieve your business goals.