As published in eServer Magazine, Mainframe Edition, September 2002

A New Day for the Database
By Craig S. Mullins

Challenges include increased complexity and the need to integrate Web and wireless functionality


Database management systems (DBMSs) are the heart of the modern IT infrastructure. Enterprises expect the DBMS to be reliable, secure and available. In short, the DBMS is what makes modern applications sing.

Modern database systems, however, are becoming increasingly large and complex. These days, databases store not only characters, numbers and dates, but complex unstructured data types like audio, video, images and compound documents. Then there's the rapid and sometimes out-of-control growth of data warehouses spurred by the business intelligence needs of many enterprises.

Besides housing data, databases store processes that act upon that data. Stored procedures, triggers and user-defined functions managed by the DBMS place new requirements on database administrators (DBAs). In addition, more than ever before, databases are being placed on more diverse platforms -- from mainframes to workstations, and even personal digital assistants (PDAs). Finally, databases are increasingly being connected to the Internet to enable e-business applications. All of these trends and new technologies further complicate database management and impact the jobs of DBAs.

The Internet and Databases

In the world of e-business, full functionality and non-stop availability are expected. Regardless of the time or date, customers expect an e-business to be up and running, available to serve their needs. Why should customers wait for your site to come back up when numerous competitors are available just a mouse-click away?

If an e-business isn't prepared to serve customers 24 hours a day, it risks losing business to more accessible competitors. System outages, whether planned (for maintenance and tuning) or unplanned (due to hardware failure, bugs or viruses), are the enemy of the successful e-business.

Many administrative tasks for Web-enabled databases differ from traditional DBA tasks. Let's refer to the DBA who takes on these expanded e-business duties as an eDBA.

The eDBA's first duty is minimizing downtime. Analysts estimate that as much as 80 percent of application downtime is due to software failures and human error: improperly entered transactions, improperly timed batch runs, or programs run with the wrong input files or parameters. DBAs can reverse the effects of such failures using the database log and high-speed transaction recovery solutions, which enable recovery from application failures without requiring a database outage at all – thereby bolstering the database for 24x7 e-business.

For the eDBA, the concept of downtime requires a shift in thinking. Though it's important to plan for recovery from unplanned outages, remember that planned outages occur more frequently and therefore can have a greater impact on overall availability than unplanned outages. Examples of planned outages include software upgrades, database changes and regularly scheduled maintenance tasks like reorganization.

Of course, the best way to reduce downtime is to avoid it. Modern database systems are gaining features to help avoid downtime altogether by enabling change, optimization and management tasks to be performed while databases remain online. One example is concurrent database reorganization. Utilities can reorganize data to a mirror copy, swapping the copies when the reorganization process completes. Because the database can stay online during the process, downtime is eliminated. These techniques require more disk space, but won't disrupt an online business. Emerging technologies are becoming available to perform other tasks such as backing up and loading databases or changing system parameters while the databases remain online for read and update.

Another way to minimize downtime is to automate routine maintenance tasks. For example, changing the structure of a table can be an arduous task. The structure of relational databases can be modified using the ALTER statement, but ALTER cannot be used for every type of change. Database change-management tools allow you to make desired changes to a relational database using an online interface; the tool automatically generates scripts that invoke the changes in the correct manner. When errors are avoided through automation, downtime is diminished, resulting in greater online availability.
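
As a sketch of what such a generated script automates, consider a column type change that ALTER alone can't make. The classic workaround is a shadow-table swap: create a new table, copy the data across, then drop and rename. Everything here is hypothetical (the connection URL, table and column names), and the exact DDL, notably RENAME, varies by DBMS:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class ChangeScript {
        public static void main(String[] args) throws Exception {
            Connection con = DriverManager.getConnection(
                "jdbc:example://dbhost:446/SAMPLE", "user", "password");
            Statement stmt = con.createStatement();
            try {
                // 1. Create a shadow table with the new definition.
                stmt.executeUpdate("CREATE TABLE CUSTOMER_NEW " +
                    "(CUSTNO INTEGER NOT NULL, BALANCE DECIMAL(15,2))");
                // 2. Copy the existing data into the new structure.
                stmt.executeUpdate("INSERT INTO CUSTOMER_NEW " +
                    "SELECT CUSTNO, BALANCE FROM CUSTOMER");
                // 3. Swap the tables. A real change-management tool would also
                //    re-create the indexes, views, authorizations and triggers
                //    that the DROP discards, which is exactly the error-prone
                //    bookkeeping worth automating.
                stmt.executeUpdate("DROP TABLE CUSTOMER");
                stmt.executeUpdate("RENAME TABLE CUSTOMER_NEW TO CUSTOMER");
            } finally {
                stmt.close();
                con.close();
            }
        }
    }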

Sometimes downtime is unavoidable. In these cases, the DBA can still minimize it by staying current with DBMS versions that feature better-optimized code and provide new data-management techniques.

Java and XML

While downtime is the most important issue faced when coupling databases to the Internet, eDBAs also need to be prepared to work with key Internet-related technologies like Java* and XML.

As applications move to the Web, Java gains in popularity. Two methods of accessing relational data from a Java program are JDBC and embedded static SQL for Java (SQLJ). JDBC is an API that enables Java to access relational databases. Similar to ODBC, JDBC consists of a set of classes and interfaces that can be used to access relational data, and it provides dynamic SQL access. Using JDBC, theoretically at least, you should be able to write an application for one platform, say Oracle9i on a Sun* Solaris* platform, and then deploy it on another, say DB2* UDB on a zSeries server; simply by using the correct JDBC drivers for the database platform, the application should be portable.

With SQLJ, by contrast, a translator must process the Java program. For the DB2 literate, this is just like precompiling a COBOL program: the translator strips the SQL from the Java code so it can be optimized into a database request module, and replaces the SQL calls with generated Java code. The entire program can then be compiled into bytecodes, and a bind can be run to create a package for the SQL.
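
To make the JDBC approach concrete, here's a minimal sketch. The driver class, connection URL and CUSTOMER table are hypothetical placeholders, not any particular vendor's names; the point is that porting the program means swapping the driver and URL, not rewriting the code:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class CustomerLookup {
        public static void main(String[] args) throws Exception {
            // Load the vendor's JDBC driver; this line and the URL below are
            // all that change when the application moves to another DBMS.
            Class.forName("com.example.jdbc.Driver");  // hypothetical driver

            Connection con = DriverManager.getConnection(
                "jdbc:example://dbhost:446/SAMPLE", "user", "password");
            try {
                // Dynamic SQL: the statement is prepared and optimized at run time.
                PreparedStatement ps = con.prepareStatement(
                    "SELECT LASTNAME FROM CUSTOMER WHERE CUSTNO = ?");
                ps.setInt(1, 1001);
                ResultSet rs = ps.executeQuery();
                while (rs.next()) {
                    System.out.println(rs.getString("LASTNAME"));
                }
                rs.close();
                ps.close();
            } finally {
                con.close();
            }
        }
    }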

So, should you use JDBC or SQLJ? Of course, the answer is "it depends!" SQLJ can potentially enhance performance through static SQL, which can be important for Java applications. SQLJ is similar to embedded SQL programs; JDBC is similar to call-level interface programs. Your developers' familiarity with either approach could make one method the better choice. So, before your shop moves forward with Java development, be sure that your DBAs have been trained in Java and understand the differences between JDBC and SQLJ.
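
For comparison, here's the same lookup sketched in SQLJ, again against the hypothetical CUSTOMER table. As described above, this source must first be run through the SQLJ translator before it can be compiled and bound:

    import java.sql.SQLException;

    // A named iterator: the translator generates a class whose accessor
    // methods correspond to the columns in the SELECT list.
    #sql iterator CustIter (String lastname);

    public class CustomerLookupSQLJ {
        public static void printCustomer(int custno) throws SQLException {
            CustIter iter;
            // Static SQL: the access path is optimized and bound into a
            // package ahead of time, not at run time as with JDBC.
            #sql iter = { SELECT LASTNAME FROM CUSTOMER WHERE CUSTNO = :custno };
            while (iter.next()) {
                System.out.println(iter.lastname());
            }
            iter.close();
        }
    }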

Another significant trend in the database market is XML. Like HTML, XML is based on Standard Generalized Markup Language (SGML), which allows documents to be self-describing through the specification of tag sets and the structural relationships between the tags. HTML is a small, specifically defined set of tags and attributes; because the tags are fixed in advance, an HTML document does not describe the data it contains. XML, on the other hand, retains the key SGML advantage of self-description while avoiding the complexity of full-blown SGML.

So what? Well, XML allows users to define tags that describe the structure and nature of the data in the document; in essence, the document becomes self-describing. XML's simple syntax is intended to make documents easy to process by machine while remaining understandable to people. Because the data describes itself, it's easily understood and used by the receiver, which is why XML is increasingly being used to transfer data.
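
As a small illustration of what self-description buys the receiver, the following sketch parses an XML fragment with the standard Java DOM API and pulls out a value by its tag name; the tags themselves are invented for this example:

    import java.io.ByteArrayInputStream;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;

    public class XmlSketch {
        public static void main(String[] args) throws Exception {
            // The tags describe the data they carry; no external layout
            // definition is needed to make sense of the document.
            String xml = "<customer>"
                       + "<custno>1001</custno>"
                       + "<lastname>Mullins</lastname>"
                       + "</customer>";

            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes()));

            // The receiver asks for data by name, because the document
            // tells it what each element contains.
            String lastname = doc.getElementsByTagName("lastname")
                                 .item(0).getFirstChild().getNodeValue();
            System.out.println("lastname = " + lastname);
        }
    }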

A typical use is for the DBMS to parse XML documents and move their contents in and out of an SQL database. Another trend is native XML database systems, which operate on XML data without transforming it into a relational structure. And XQuery support is being added to database systems to enable users to query native XML within the database.

Procedural DBA

Until recently, DBMSs solely stored, managed and accessed data. Although these core capabilities are still required, modern DBMS products also integrate procedural logic like triggers, stored procedures and user-defined functions (UDFs).

Procedural database logic offers the primary benefit of promoting reusability. Instead of replicating code within multiple application programs, code can reside in a single place: the database server. This is preferable to cannibalizing sections of program code for each new application that must be developed. An additional benefit is increased consistency: if every user and every database activity with the same requirements uses the single copy of database logic instead of multiple, replicated code segments, the organization can be sure that everyone is running consistent code.
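
A trigger is a simple example of this single-place principle: the rule lives in the database, so every program that touches the table runs it. Here's a hedged sketch that creates a timestamp-maintenance trigger through JDBC; the DDL shown follows DB2-style syntax, and the table, column and rule are invented for illustration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class CreateTrigger {
        public static void main(String[] args) throws Exception {
            Connection con = DriverManager.getConnection(
                "jdbc:example://dbhost:446/SAMPLE", "user", "password");  // hypothetical
            Statement stmt = con.createStatement();
            try {
                // The business rule is coded once, at the server; every
                // application that updates CUSTOMER gets it automatically.
                stmt.executeUpdate(
                    "CREATE TRIGGER STAMP_CHG" +
                    "  NO CASCADE BEFORE UPDATE ON CUSTOMER" +
                    "  REFERENCING NEW AS N" +
                    "  FOR EACH ROW MODE DB2SQL" +
                    "  SET N.LAST_UPDATED = CURRENT TIMESTAMP");
            } finally {
                stmt.close();
                con.close();
            }
        }
    }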

Although the functionality provided by triggers, stored procedures and UDFs is unquestionably useful, these objects pose major administration challenges. DBAs must administer, design, manage and sometimes even code these objects, even though code testing and debugging isn't a typical DBA role. Database experts cannot be expected to debug code written in C or COBOL. Companies may rely on DBAs to be their SQL experts, but often they're not experts – at least not programming experts. Simply because a DBA knows the best way to create a physical database design doesn't mean he'll know the best way to access that data.

A new type of DBA – a procedural DBA – is required to accommodate procedural logic administration. Procedural DBAs require both database and programming skills. They should be responsible for database-related programming activities, with administering stored procedures, triggers and UDFs among their primary responsibilities. Whether these objects are actually programmed by the procedural DBA will depend on the size of the shop, the number of DBAs available and the scope of implementation. Minimally, the procedural DBA should lead code reviews and manage the use (and reuse) of database procedural logic. Additionally, the procedural DBA must be on-call in the event of a stored procedure, trigger or UDF failure.

Other procedural administrative functions can be allocated to the procedural DBA, including application code reviews, access-path review and analysis, SQL debugging and complex SQL analysis. These are areas in which many DBAs are inadequately trained; programming is a distinctly different skill from designing well-structured relational databases. Yet DBAs quickly learn that they must understand efficient application programming techniques. Offloading some of these tasks to the procedural DBA enables traditional, data-oriented DBAs to concentrate on the physical design and implementation of databases, which should result in better-designed databases and better overall performance.

The PDA DBA

Mobile workers are here to stay. And DBAs must be ready to support them with a valid, shared data infrastructure.

While a PDA database shouldn't require the in-depth tuning and administration of enterprise databases, the DBA must be involved in the initial design of PDA databases to ensure they're optimal for the handheld environment. But design is the smaller concern; the bigger impact lies in planning for and managing data synchronization from hundreds or thousands of PDAs. When should synchronization be scheduled? How will it impact applications that use the large production databases involved in the synchronization? How can you ensure that mobile users synchronize their data reliably and on schedule?

These aren't minor issues. Before enabling a large contingent of PDA database users, the DBA staff must prepare for the impact on enterprise databases. The first task is analyzing and understanding data-synchronization technology, as well as evaluating the need for remote database users within the organization. Which application systems will be impacted first? Typically, those with remote workers, such as sales or delivery-tracking systems, are the likely candidates. Take some time to review the data requirements of those applications and how a large influx of remote connections might impact current systems.

The major DBMS vendors offer small footprint versions of their flagship products to run on PDAs and other handheld devices. (IBM's offering, for example, is DB2 Everyplace.) They're geared for storing a small amount of critical data that's later synchronized to long-term data stores.

The Speed of Change

A final overriding database trend is the overall speed of change. Most organizations cannot implement new DBMS versions as fast as the vendors deliver them. DBMS vendors are on a 12- to 24-month cycle for releasing major new versions of their products, and in between comes a bevy of PTFs and fixes. Keeping up-to-date is a major struggle.

Indeed, many users are still running old versions of their DBMS because of the migration and testing required to move from one version to the next, even when vendors make an effort to simplify the process. Programs need to be run to verify that performance remains comparable under the new release, data sometimes needs to be migrated to a new format, and older features are increasingly dropped from new releases, which can force the reworking of programs and databases that still function properly.

The Role of Automation

Database management is getting more complex, which in turn increases the skill level required to successfully administer databases. Today's DBA must be skilled not just in database design and tuning, but in Internet/Web technology, procedural logic, complex data types, pervasive computing, change management and other areas.

Indeed, a successful DBA must possess more skills than most technicians can master. However, intelligent automation can help address this problem. IT professionals spend their careers delivering systems that automate everyone else's job, yet we have still to intelligently automate most of our own DBA tasks. Automating some of the tedious day-to-day tasks frees up time to learn about new DBMS features – and to implement them appropriately.

But simple automation is not sufficient. The software should be able to intelligently monitor, analyze and optimize applications using past, present and future analysis of collected data. Simply stated, the software should work the way a consultant works – fulfilling the role of a trusted advisor.

By deploying intelligent automation, the underlying complexity of the database environment can be somewhat simplified, enabling the DBA to successfully manage the modern database environment. In this day and age, anything less is a recipe for failure.

© 2002 Craig S. Mullins. All rights reserved.