Saturday, August 01, 2015

NYOUG July 2015 Session on Tracing and Profiling

Many thanks to all those who came to attend my two sessions, "The Art and Craft of Tracing" and "Profiling for Performance in PL/SQL," at the New York Oracle Users Group meeting on July 29th. Here you can download the presentations as well as the scripts I used for the demos.

Presentation: Tracing
Presentation: Profiling
Scripts for both the Tracing and Profiling sessions (ZIP file)

As always, your feedback will be highly appreciated. Do write to me about what you liked or didn't like, and how it helped you at work.

Sunday, April 12, 2015

No Data Loss without Synchronous Network

I don't usually write about specific products; but once in a while I encounter something worth talking about, especially if it addresses a very common issue anyone with datacenter management responsibilities will understand: avoiding the seemingly unavoidable loss of last-minute data changes in the database after a disaster, without expensive synchronous replication. This product solves that problem with an innovative, out-of-the-box approach. The product is Phoenix Systems from Axxana.

Replication and Last Mile Transactions

Imagine this scenario: the datacenter is on fire. All components (servers, storage, network) are destroyed. You can't salvage anything of value. What happens to the data you need to continue your business? Fortunately you have a disaster recovery plan. You replicate the database to a remote datacenter using either Data Guard (from Oracle) or some storage-level technology such as EMC's SRDF. So the database comes up on the remote site; but there is a problem. Since you used asynchronous replication (such as maximum performance mode in Data Guard), not all changes have made it to the remote site. The final few changes, usually known as last mile transactions, were yet to be shipped to the remote site when the disaster struck. They are lost forever in the fire. While you do have a database at the remote datacenter, and that database is almost complete (the operative word is "almost"; not 100%, mind you), some data is inevitably lost. What if you can't afford to be "almost" complete? For organizations such as financial institutions, hospitals, manufacturers and airlines, where losing data is simply not an option, this is not a luxury they can afford. Even for organizations where data loss may be somewhat more tolerable, losing data creates a sense of uncertainty, because you don't know exactly what was lost.

If you need to be 100% certain that all data is available at the remote site, what would you do?

It's easy: use synchronous replication. All changes on the primary database are reflected at the remote site in real time. If you used Data Guard, you would have run it in maximum protection mode. In that case, Oracle would have made sure the changes were recorded at the remote database before confirming the commit at the local site. When the primary site is gone, the remote site is available with 100% of the data. Bye bye, "almost".
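For reference, here is a minimal sketch of what that looks like in Data Guard; the standby name below is a placeholder, and a real configuration needs standby redo logs and more parameters than shown here.

 -- On the primary: ship redo synchronously and wait for acknowledgement (SYNC AFFIRM);
 -- "stby" is a hypothetical DB_UNIQUE_NAME / TNS service name
 alter system set log_archive_dest_2 =
   'SERVICE=stby SYNC AFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=stby';

 -- Raise the protection mode (depending on the release, the primary may need to be mounted for this)
 alter database set standby database to maximize protection;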

Brilliant. So, why didn’t you do it? Better yet, why doesn’t everyone do it, since it is that simple? Data loss could be unacceptable in the worst case and unsettling at best. Why accept it?

Synchronous Replication

Well, there is a pesky little detail. Synchronous replication means the network connectivity has to be rock solid. Networks have three general characteristics: throughput (how much data they can pump; think of a three-lane highway compared to a single-lane local road), latency (how long the data takes to get through, independent of the raw speed; the car on the highway may travel at 70 miles per hour, but it still spends a few minutes on the ramp), and reliability (does the data move from point A to point B accurately the first time, or does it have to be re-transmitted a few times?).

Synchronous replication requires the throughput to be very high (to support the large amount of change, at least during bursts of activity), the latency to be very low (otherwise the remote site will get the data late and respond to the primary even later, forcing the primary to hold off the commit), and the network to be extremely reliable (otherwise the primary may think the remote site is not accessible and shut itself down to protect the data). If you run Data Guard in maximum protection mode, you are essentially telling Oracle to make sure that the remote site has absolutely, positively, undoubtedly (add any other adverb you can think of) received the change and committed it.

If Oracle can't ensure that for any reason, such as not getting the response in time over a less reliable network, what choices does it have? If it allows further data changes, those changes are not at the remote site yet; if the primary database fails at that moment, the data is gone. Therefore Oracle has no choice but to shut down the primary database to stop any more changes from coming in. It's maximum protection, after all, and that's what you instructed it to do.

So, if you decide to use maximum protection, you have to use a high-throughput, low-latency and extremely reliable network infrastructure. Most public commercial network infrastructures are none of these. Either you have to contract a commercial carrier to provide this elevated level of service, or build your own, for instance using dark fiber. The cost of the network infrastructure climbs steeply, especially as the remote site gets farther from the primary. In many cases, the cost of the network itself may be several times that of the database infrastructure it protects. Owing to the cost, you may be forced to locate the remote site close by, e.g. in New York City and Hoboken, NJ. It will still be exceedingly expensive, and it may not offer the degree of protection you expect: these two cities are close enough to fall in the same exposure area for disasters such as floods, hurricanes, war and so on. The farther away the remote site is, the more protected your data is; but the higher your cost. Could you accomplish no data loss without this expensive proposition?

Until now, the decision was really black and white. If you want no data loss at all, you have no choice but to go for a super expensive network solution. Many companies can't justify that high sticker price and therefore settle for potential data loss. Many, in fact, make detailed plans, as part of their business continuity efforts, for handling this lost data.

It's assumed that zero data loss is synonymous with an expensive network: if you don't have a high-throughput, low-latency, highly reliable network, you have to live with some data loss.

Here is the good news: it doesn't have to be that way. It is now possible to use cheaper, commoditized public network infrastructure and still have complete data protection. Allow me to explain.

Data Protection in Oracle

In Oracle, the data files are written asynchronously, at intervals unrelated to the data changes and commits. In other words, when you commit a change, the data files may not have that changed data. In fact, the change occurs only in memory (in an area called the buffer cache) and may not reach the data files for hours afterwards. Similarly, when you make a change but do not commit it, the data can still be persisted to the data files. Let me repeat that: the data files are updated with the changed data even if you haven't committed yet. This is why, if you replicate the data files with a storage or operating system level replication solution (even a synchronous one), the remote site may or may not have the data, even hours after the change.
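If you want to see this for yourself, one simple illustration, assuming you have access to V$BH, is to count the dirty buffers: blocks that have been changed in the buffer cache but not yet written to the data files.

 -- Buffers changed in memory but not yet written to the data files
 select count(*) from v$bh where dirty = 'Y';

 -- For comparison: all buffers currently in the cache
 select count(*) from v$bh;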

How does Oracle protect the data that was changed and committed but is still only in memory, if the data files do not have it? It captures the pre- and post-change data and packages it into something called redo blocks. Remember, these have nothing to do with data blocks. They merely describe the changes created by activities performed on the database. This redo data, also known as redo vectors, is written to a special area in memory called the log buffer. When you commit, the relevant redo blocks from the log buffer are written to special files in the database called redo log files, also known as online redo log files. The commit waits until this writing, known as redo flushing, has finished. You can check which Oracle sessions are waiting for this flushing to complete by looking at the event "log file sync". Since the changes, most importantly the committed changes, are recorded in the redo log files, Oracle does not need to rely on memory alone to know which changes are committed and which are not. In case of a failure, Oracle examines the redo logs to find these changes and updates the data files accordingly. Redo logs are very small compared to the data files.
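A couple of illustrative queries (against views available in any recent Oracle release) show this mechanism at work: which sessions are waiting right now for their commit's redo to be flushed, and how much time the instance has spent on it overall.

 -- Sessions currently waiting for their redo flush to complete
 select sid, state, seconds_in_wait from v$session where event = 'log file sync';

 -- Cumulative waits for redo flushing since instance startup
 select total_waits, time_waited_micro from v$system_event where event = 'log file sync';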

By the way, redo flushing also occurs at other times: every three seconds, whenever 1 MB of redo accumulates, when a third of the log buffer is full, and on a few other events; but those additional flushes merely keep the redo log file up to date, even if there is no commit.

As you can see, the redo log file becomes the most important thing in the data recovery process. When a disaster occurs, you may have copies of the data files at the remote site (thanks to the replication); but as you learned in the previous section, those copies are not useful yet, since they may not have all the committed changes and may even contain uncommitted changes. In other words, this copy is not considered "consistent" by Oracle. After a disaster, Oracle needs to check the redo log files and apply the changes to make the data files consistent. This is known as "media recovery", and you have to initiate it. In case of synchronous replication at the storage or operating system level, the redo logs are perfectly in sync with the primary site, and Oracle has no trouble getting to the last committed transaction just before the failure. There will be no data lost as a result of the recovery process. In case of Data Guard with maximum protection, this is not required since the changes are applied at the remote site anyway. But what about your cheaper, commodity network with asynchronous replication? The redo logs at the remote site will not be up to date with the primary site's redo logs. When you perform media recovery, you can't get to the very last change before the failure, simply because you may not have it. Therefore you can perform a recovery only up to the last information available in the redo at the remote site. This is known as "incomplete" media recovery, as distinguished from the "complete" media recovery described earlier. To complete the media recovery, you need the information in the redo log files at the primary site; but remember, the primary site is destroyed. That information is gone. You end up with data loss. Perhaps even worse, you won't even know exactly how much you lost, since you don't have access to the primary redo log files.
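To make the distinction concrete, here is roughly what an incomplete recovery at the remote site looks like; this is only a sketch, and the exact commands depend on how the remote copy was produced.

 -- Mount the replicated copy at the remote site
 startup mount;

 -- Apply whatever redo is available; type CANCEL when nothing more can be supplied
 recover database using backup controlfile until cancel;

 -- Open with RESETLOGS; any change not present in the available redo is lost
 alter database open resetlogs;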

Now, consider this situation carefully. All you need is the last redo log file from the primary database to complete the recovery. Unfortunately, the file is not available because it has been destroyed, or is inaccessible because the site itself is. This tiny little file is the only thing standing between you and a complete recovery. What if you somehow, magically, had access to this file even though the rest of the datacenter is gone? You would have been able to complete the recovery with no data loss and look like a hero, and that too without a synchronous replication solution and its super expensive network.

Enter the Black Box

Oracle's redo data can be written to multiple files at the same time. Such a set of files is called a redo log group, and the files are called members. The log flush writes to all members before confirming the flush, so all members of a group have the same data; multiple members exist purely for redundancy. As long as one member of a group is available, Oracle can use it to perform recovery. This is where the new tool from Axxana comes in. It is a storage device, named Phoenix, on which you create the second member of each redo log group. The first member of the group stays on your normal storage as usual. When disaster strikes and nukes the primary site, you have a copy of the all-important redo log in the Phoenix system.
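Creating that second member is plain Oracle administration, nothing exotic; a sketch, assuming the Phoenix storage is mounted at a hypothetical path /axxana and the database has three redo log groups:

 alter database add logfile member '/axxana/ACMEDB/redo01b.log' to group 1;
 alter database add logfile member '/axxana/ACMEDB/redo02b.log' to group 2;
 alter database add logfile member '/axxana/ACMEDB/redo03b.log' to group 3;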

This is where the first benefit of the Phoenix system comes in. The storage is not just any ordinary storage: it's encased in a bomb-proof, fire-proof and water-proof container which protects the internal storage from many calamities. The storage has normal connectivity, such as a network port to join your network as a NAS and a fiber port to connect to a fiber switch as a SAN. Under ordinary circumstances you use these ports to connect the storage to your infrastructure and use it like any other storage. After a disaster, thanks to the near-indestructible enclosure, this storage will most likely be intact. All you have to do is access the data on it and perform a complete recovery. No data needs to be lost.

But how do you get to the information on the Phoenix system? This is where the second benefit comes in. The Phoenix system at the primary site transfers its data to a companion Phoenix component at the remote site, creating a replica of the required redo logs there. Since the only data on it is the small set of redo log files, the amount to transfer is tiny and does not put a toll on your other network infrastructure.

Note a very important point: the Phoenix system does not transfer any data during normal operation. It's only during a disaster that the system transports the redo log files needed to complete a 100% recovery at the remote site.

But all this still depends on getting the data out of the Phoenix system that was at the primary site. What if the disaster site is physically inaccessible, or it is infeasible to physically transport the Phoenix system to a location where it can be connected to your network? That is quite possible after floods, hurricanes and other natural disasters, or manmade ones like war or strikes. The network cables are also likely to be out of commission after a disaster. Without that access, you might ask, how can the Phoenix system deliver the needed last mile transactions from the primary site?

No worries; there is a cellular modem built into the Phoenix system that allows you to connect to it from the remote site and transfer the data wirelessly over a cellular network. The system also has its own battery, which keeps it operational even when external power is gone, a common occurrence in almost any calamity. What's more, the transfer of data after the disaster can also use this cellular connectivity, so you may not even need to physically connect the storage at the (now defunct) primary site to your network. The data you need to perform the complete, no-data-loss recovery may already have been transferred to the system at the remote site and be waiting for you. In any case, you have access to the data. And all this comes without the need to invest in synchronous replication and expensive network infrastructure.

In summary: assuming the primary site is gone and you are using a cheaper asynchronous network, the remote site will be only partially up to date and the last mile transactions will be lost along with the primary database. However, since you used the disaster-proof Phoenix system for an additional member of each redo log group, the last mile transactions will be intact in that system, even though the remote site won't have them yet. At this point you have multiple options:

1. If the network infrastructure between the primary and remote sites is still operational (rare, but possible), the Phoenix data transfer creates the copy of the redo logs at the remote site automatically. With this in place, you can perform a complete database recovery. Data loss: 0%.
2. If the network infrastructure is not operational, Phoenix will automatically engage the built-in cellular modem and initiate the data transfer to the remote site. It will be slow; but the amount of data to be transferred is small, so it won't matter much. Once the transfer is complete, your recovery will be complete with 0% data loss.
3. If you don't have cellular access either, but do have access to the physical Phoenix device, you can mount it on your network and perform a complete recovery.

As I said at the start, this is a very simple and elegant out-of-the-box solution to a complex problem. It opens up the possibility of a no-data-loss recovery after a disaster in situations where that option simply didn't exist before. In a future post I will describe my hands-on experience with the Phoenix system in detail, with screenshots, scripts, etc. Stay tuned.

More about Axxana:

Tuesday, November 18, 2014

Oracle Virtual Technology Symposium: Download

To all those who attended my session today at the Oracle Virtual Technology Summit on In-Memory, thank you very much. Some questions were cut off towards the end, since it was an automated session. If your question was in that category, I will be happy to answer it.
Since this was an on-demand show, you can watch the replay anytime by visiting this site. Remember, this requires an internet connection to watch; it's not a download. You can download the slides from here and the scripts I used in the demo here.

Please post any questions you couldn't ask during the webinar.

Monday, November 10, 2014

Cache Buffer Chains Demystified

You must have seen sessions waiting on the event "latch: cache buffers chains" from time to time. If you have ever wondered what this means and how you can reduce the time spent on it, read on. Here you will learn how the buffer cache works, how Oracle multi-versioning works, how buffers are allocated and deallocated, what a hash chain is and how buffers are linked to it, what the role of the cache buffers chains latch is and why sessions wait on it, how to find the objects causing the contention, and how to reduce the time spent on that event.

While exploring the reasons for slowness in database sessions, you check the wait interface and see the following output:
 SQL> select state, event from v$session where sid = 123;

 STATE   EVENT
 ------- ---------------------------
 WAITING latch: cache buffers chains

This wait event is quite common, especially in applications that repeatedly scan a small set of blocks. To resolve it, you should understand what the cache buffers chains latch is and why sessions have to wait on it. To understand that, you must understand how the Oracle buffer cache works. We will explore these one by one, and close with how to reduce the time spent on cache buffers chains latch waits.
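As a preview of the diagnostic side, here is a hedged sketch of how you might trace the wait back to the objects involved; the SID and the latch address are placeholders, and X$BH requires SYS access.

 -- P1 of the wait gives the address of the child latch the session is waiting on
 select p1raw from v$session where sid = 123 and event = 'latch: cache buffers chains';

 -- How contended are the cache buffers chains child latches?
 select addr, gets, misses, sleeps
 from   v$latch_children
 where  name = 'cache buffers chains'
 order  by sleeps desc;

 -- As SYS: map the buffers protected by a given child latch to their objects
 select o.owner, o.object_name, bh.tch
 from   x$bh bh, dba_objects o
 where  bh.hladdr = hextoraw('&child_latch_addr')
 and    o.data_object_id = bh.obj
 order  by bh.tch desc;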

Thursday, October 02, 2014

OOW14 Session: The Art and Craft of Tracing

A big thanks to all those who braved attending my session on the final day of Oracle Open World 2014. I hope you enjoyed it and found it valuable.

You can download the slide deck as well as the scripts I mentioned here.

As always, I would love to hear from you.

OOW14 Session: RAC'fying Multitenant

Thank you for attending my session RAC'fying Multitenant at Oracle Open World 2014. You can download the slide deck here.

[Updated] Oct 4th, 2014: The article on multitenant I wrote for OTN is available here. This article shows various commands I referenced in my session, e.g. point in time recovery of PDBs.

As always I would love to hear from you.

Monday, September 29, 2014

OOW14 Session: SQL Tuning Without Trying

Many thanks to all those who attended my session "SQL Tuning without Trying" at Oracle Open World 2014 #oow14. I hope you found the session useful.

If you want to download the slide deck, here it is. As always, I would really, really like to hear from you.

OOW14 Session: Cache Fusion Demystified

Thank you to all those who attended my session Cache Fusion Demystified.

As I mentioned, it was impossible to fit a demo into that compressed 45-minute slot. I have put the details of the demo in a separate file, with all the scripts and instructions, so that you can run the demos at your own site. In addition, I have also written a whitepaper to explain the concepts more clearly.

Download the slides, the paper, the demo scripts and instructions here.

I sincerely hope it demystified one of the most complex topics in Oracle RAC. As always, I would love to hear from you about your experience.

Sunday, September 28, 2014

OOW14 Session: Cache Buffer Chains Latches Demystified

Thank you to all those who came to attend my session "Demystifying Cache Buffer Chains Latches" at Oracle Open World 2014. Much appreciated. I hope you got something out of your time investment.

I have also written a paper to accompany this session. The paper explains the concepts I presented in greater detail. You can download a zip file containing the slide deck, the paper and the scripts here. Please note: this is a zip file.

And, yes, here is the blog entry that talks about why you need 7X memory to completely fit your database in the buffer cache.

As always, I would love to hear about your thoughts on the presentation.

One down at #oow14; five more to go :)

Saturday, September 27, 2014

My Small World at Oracle Open World 2014

Here is my speaking schedule at Oracle Open World #oow14. I have a whopping six sessions and one SIG. For a change, I don't have an analyst meeting; I do, however, have a customer reference.

1. Sunday, Sep 28th, 11:00 - Demystifying Cache Buffer Chains, Moscone South 309
2. Monday, Sep 29th, 13:30 - Understanding Cache Fusion, Moscone North 130
3. Tuesday, Sep 30th, 15:00 - Exadata SIG, Moscone South 208
4. Tuesday, Sep 30th, 16:00 - SQL Tuning Without Trying, Moscone South 104
5. Thursday, Oct 2nd, 9:30 - RAC'fying Multitenant, Moscone South 102
6. Thursday, Oct 2nd, 13:15 - Art and Craft of Tracing, Moscone North 131
7. Thursday, Oct 2nd, 14:30 - Near Zero Downtime Migration, Intercontinental Grand Ballroom B

I hope to meet and say hello to many of you at one of these. If you can't make it, the presentations, scripts and [possibly] whitepapers will be posted on my blog after the event. But, of course, do come. Looking forward to meeting you there.

And there will be tons of sessions I must attend, at #oow14 and Oak Table World.

Thursday, June 26, 2014

Sun Coast Oracle User Group June 24th Sessions Materials

Thank you to all those who stayed back quite late that night for my two presentations. I hope you found them informative and useful.

As promised, you can download the session materials here.

As always, I will be honored to hear from you.

Wednesday, June 11, 2014

Hadoop for Oracle Professionals Article on Oracle Scene

Oracle Scene (the publication of the United Kingdom Oracle Users Group) has published my article "Hadoop for Oracle Professionals", in which I have attempted, like many others, to demystify terms such as Hadoop, Map/Reduce and Flume. If you are interested in Big Data and everything that comes with understanding it, you might find the article useful.

A PDF version of the article can be downloaded here.

Tuesday, June 03, 2014

NYOUG Summer 2014 Conference: Understanding Oracle Locking Internals

Thank you to all those who attended my session Understanding Oracle Locking Internals today at the Summer 2014 conference of the New York Oracle User Group in Manhattan. You can download the presentation and the scripts I used in the demos here.


As always, your feedback will be highly appreciated.

Friday, April 25, 2014

Creating Controlfile From Scratch when No Backup is Available

You have lost the controlfile, the catalog, and the backup of the controlfile too; so restoring the controlfile from a previous backup is not an option. How can you recover the database? By creating the controlfile from scratch. Interested in learning how? Read on.
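To give you a flavor of what is involved, here is a bare-bones sketch; the database name, file names and sizes below are made up, and in real life they must match the actual files on disk exactly.

 startup nomount;

 create controlfile reuse database "ACMEDB" noresetlogs archivelog
     maxlogfiles 16
     maxdatafiles 100
 logfile
     group 1 '/u01/oradata/ACMEDB/redo01.log' size 50m,
     group 2 '/u01/oradata/ACMEDB/redo02.log' size 50m
 datafile
     '/u01/oradata/ACMEDB/system01.dbf',
     '/u01/oradata/ACMEDB/sysaux01.dbf',
     '/u01/oradata/ACMEDB/undotbs01.dbf',
     '/u01/oradata/ACMEDB/users01.dbf'
 character set AL32UTF8;

 -- then make the database consistent and open it
 recover database;
 alter database open;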

Thursday, April 24, 2014

How to Get the DBID when the Instance is in NOMOUNT State

You lost your controlfile and the catalog. To restore the controlfile, you must know the DBID. Did you follow the advice to write down the DBID in a safe place? You didn't, did you? Well, what do you do next? Don't worry; you can still get the DBID from the header of the data files. Read on to learn how.

Tuesday, April 22, 2014

Restoring the Controlfile When AUTOBACKUP Fails

Allow me to present a snapshot of a day in the life of John, the DBA at Acme Bank. On this particular day a database John manages crashed entirely and had to be restored from backup. He takes regular (backupset) RMAN backups to tape. Since everything, including the controlfile, had crashed, John had to first restore the controlfile and then restore the database. The controlfile is always backed up with the backup database command; John was sure of that. However, restore controlfile from autobackup gave the error:
RMAN-06172: no AUTOBACKUP found or specified handle is not a valid copy or piece
Without the controlfile, the recovery was stuck, even though all the valid backup pieces were there. It was a rather alarming situation. Others might have panicked; but not John. As always, he managed to resolve the situation and complete the recovery. Interested to learn how? Read on.
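Without giving the whole story away, one possible way out of that error (a sketch, not necessarily the exact route John took) is to point RMAN directly at a backup piece that is known to contain the controlfile; the piece name below is hypothetical.

 RMAN> startup nomount;
 RMAN> restore controlfile from '/backup/acme/ctl_piece_1234.bkp';
 RMAN> alter database mount;
 RMAN> restore database;
 RMAN> recover database;
 RMAN> alter database open resetlogs;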

Thursday, April 17, 2014

Boston Oracle User Group Session: Oracle 12c Features You Should Know

Thank you to all those who attended the session and braved it out until 10 PM. Much, much appreciated.

Download the slides here, and the scripts I used for the demos here.

As always, your feedback will be highly appreciated.

Friday, April 11, 2014

Collaborate14 Session: Cache Buffer Chains Demystified

Thank you all for coming to my session Cache Buffer Chains Demystified at Collaborate 14, especially for sticking around for a geeky topic like this to the very end. Much appreciated.

I was not aware that I would not be allowed to use my laptop, so I couldn't show all the demos I had so carefully prepared. Please download the scripts and execute them yourself.

As promised, here are the materials I used in the session:

The Slide Deck
The Whitepaper
The Scripts (this is a zip file; so right click and Save As ...)

Needless to say, your comments and feedback will be highly appreciated. And, yes, please don't forget to do the evaluation on the Collab Mobile App.

Thursday, April 10, 2014

Collaborate14 Session: The Art and Science of Tracing

Thank you all for coming to my session "The Art and Science of Tracing" at Collaborate 2014. As I mentioned, I prepared a full session even though this is supposed to be a quick tip. I hope you enjoyed it and get the value from the full presentation deck.

You can download

The slide deck
The scripts (this is a zip file. Right click and then Save As ...)

As always, your feedback will be immensely appreciated.

Wednesday, April 09, 2014

Collab14 Session: Should You Drop Indexes in Exadata?

Thank you to all those who attended my 8:30 AM session at Collaborate 14, "Should You Drop Indexes in Exadata?" (Session 316). I would appreciate your feedback; please submit it on the IOUG website and mobile app.

Here are the materials I presented. The slides are here; but I suggest you download the paper, as it is more self-contained.