Saturday, December 27, 2008

Outer joins on non-preserved columns

In this post I will try to explain why an outer join is also needed on conditions whose columns may already be null because of an earlier outer join.

There are three tables: MYUSER, TEST1, and TEST2. MYUSER is related to the TEST1 table, and TEST1 is related to the TEST2 table. The requirement is to fetch all users from the MYUSER table and, where present, the related data from the TEST1 table and the TEST2 table.

The query would be

SELECT USERV.USERID, USERV.NAME,
       TEST1.ID, TEST1.TEST1NAME,
       TEST2.TEST2ID, TEST2.TEST2NAME
FROM MYUSER USERV,
     TEST1 TEST1,
     TEST2 TEST2
WHERE USERV.USERID = TEST1.ID(+)
  AND TEST1.ID = TEST2.TEST2ID(+)
ORDER BY 1

The point to note is that the condition AND TEST1.ID = TEST2.TEST2ID(+) must keep the (+) sign as shown. This is required because the first join returns all users even when there is no corresponding record in the TEST1 table; for those preserved rows, the TEST1.ID column is null. For such a row to survive the join with the TEST2 table, the (+) on the TEST2 side must be kept as well.

This can be seen in the following example.

MYUSER Table

USERID   NAME
1        Bill
2        Purva
3        James
5        Meij
TEST1 Table

ID   TEST1NAME
1    test1bill
2    test1purva
TEST2 Table

TEST2ID   TEST2NAME
2         test2purva
3         test2james

The query that meets the requirement is

SELECT USERV.USERID, USERV.NAME,
       TEST1.ID, TEST1.TEST1NAME,
       TEST2.TEST2ID, TEST2.TEST2NAME
FROM MYUSER USERV,
     TEST1 TEST1,
     TEST2 TEST2
WHERE USERV.USERID = TEST1.ID(+)
  AND TEST1.ID = TEST2.TEST2ID(+)
ORDER BY 1

ANSI Syntax

SELECT USERV.USERID, USERV.NAME,
       TEST1.ID, TEST1.TEST1NAME,
       TEST2.TEST2ID, TEST2.TEST2NAME
FROM MYUSER USERV
LEFT OUTER JOIN TEST1 TEST1 ON (USERV.USERID = TEST1.ID)
LEFT OUTER JOIN TEST2 TEST2 ON (TEST1.ID = TEST2.TEST2ID)
ORDER BY 1

USERID   NAME    ID   TEST1NAME    TEST2ID   TEST2NAME
1        Bill    1    test1bill
2        Purva   2    test1purva   2         test2purva
3        James
5        Meij

Query 2, without the outer join on the second condition, is

SELECT USERV.USERID, USERV.NAME,
       TEST1.ID, TEST1.TEST1NAME,
       TEST2.TEST2ID, TEST2.TEST2NAME
FROM MYUSER USERV,
     TEST1 TEST1,
     TEST2 TEST2
WHERE USERV.USERID = TEST1.ID(+)
  AND TEST1.ID = TEST2.TEST2ID
ORDER BY 1

USERID   NAME    ID   TEST1NAME    TEST2ID   TEST2NAME
2        Purva   2    test1purva   2         test2purva

To be precise, if a table is outer joined (so its columns can be null in the result) and those columns are in turn joined to another table, that join condition must also be an outer join; otherwise the rows preserved by the first outer join are lost.

Thursday, December 18, 2008

Consistent Read of the Data in Oracle Queries

In some cases, you might want different queries to return the same set of data as long as the WHERE criteria defined on the queries are the same. For example:

Select price from vehicle_price where status = 'A'

Select sum(price) from vehicle_price where status = 'A'

If I run these queries within the same transaction, there is a possibility that the sum returned by the second query will not equal the total of the prices returned by the first query. This can happen if new records are added to the vehicle_price table with status 'A', or if the status of an existing record changes from 'A' to something else, between the two queries.

There are multiple ways to ensure that you always get the same data:

1. Lock the table and perform the queries on it. This ensures that no one else can modify the data while your transaction is in progress. I won't be surprised if you get shouted at by the other folks who use this table in some other part of the application.

2. Make the data time sensitive. Every record that goes into the table gets an effective start time and an effective end time, and the query always filters on these columns. A row is never modified; instead a new row is added with the new status. As long as the query uses the effective start date and effective end date, the data returned by subsequent queries will be the same. This is a better approach because other users are not impacted, but there can be size and performance implications.

3. Use the SERIALIZABLE isolation level for the transaction that queries the tables. A serializable transaction in Oracle sees only the data committed before it began, so all queries inside it return consistent results. This too has an impact on how the table is used by concurrent transactions.

4. Use the read-only transaction feature. There are a few limitations to this feature: it can be set only for transactions that contain nothing but read statements. Oracle maintains a snapshot of the data as of the transaction start time, and all queries operate on that snapshot. This should be used where it is viable. A commit or rollback on the transaction ends the read-only mode. The command is SET TRANSACTION READ ONLY;

These are a few of the ways by which we can ensure that queries return the same data within the same transaction.
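As a rough sketch of option 4 from Java (assuming an already opened Oracle JDBC connection and reusing the vehicle_price example above; the class and method names here are mine, not part of any API):

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class ConsistentReadDemo
{
    // conn is assumed to be an open connection to the Oracle database
    static void readConsistently(Connection conn) throws Exception
    {
        // SET TRANSACTION must be the first statement of the transaction
        conn.setAutoCommit(false);
        try (Statement stmt = conn.createStatement()) {
            stmt.execute("SET TRANSACTION READ ONLY");

            // both queries now work on the same snapshot, taken at transaction start
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT price FROM vehicle_price WHERE status = 'A'")) {
                while (rs.next()) {
                    System.out.println("price = " + rs.getBigDecimal(1));
                }
            }
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT SUM(price) FROM vehicle_price WHERE status = 'A'")) {
                if (rs.next()) {
                    System.out.println("sum   = " + rs.getBigDecimal(1));
                }
            }
        } finally {
            // commit (or rollback) ends the read-only transaction
            conn.commit();
        }
    }
}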

Friday, December 5, 2008

Use Decode and Sign function together

DECODE is useful in queries to a large extent. Sometimes people write PL/SQL code to achieve things that are possible with a plain DECODE statement.

For example, the requirement is to add two columns retrieved from a table; if the sum is less than zero, output zero, otherwise output the sum of the two values. This can be done with a DECODE statement:

select column1, column2, column1 + column2,
       decode(sign(column1 + column2), -1, 0, column1 + column2)
from tableA

The above query means that if the sum of column1 and column2 is less than zero, zero is output as the value; otherwise the sum itself is output. We had to use the SIGN function because DECODE does not support comparisons such as less than or greater than.

JDK Logging

Log4j is widely accepted as a logging component. There is yet another one: the logging framework built into Java. This article discusses the logging provided by the JDK (java.util.logging).

In the ideal scenario, the entire configuration can be done through the logging.properties file. By default the JRE picks up the logging.properties file in its jre/lib folder. In case you want to provide a different properties file, it can be set using a system property, as shown below:

java -Djava.util.logging.config.file=/home/users/mypath/mylogging.properties

A sample logging.properties file is present in the jre/lib folder. This file can be edited to get the necessary logging. The structure of the logging properties file is shown below:


#Defines the handlers which can be used by the logger. It can be comma separated. For
# each handler, definition should be provided in the properties file
handlers= java.util.logging.FileHandler, java.util.logging.ConsoleHandler
#Default level of logging in case specific logging level as shown in the last line is not
# mentioned.
.level= INFO
#Details about the filehandler.
java.util.logging.FileHandler.pattern = %h/java%u.log
java.util.logging.FileHandler.limit = 50000
java.util.logging.FileHandler.count = 1
java.util.logging.FileHandler.formatter = java.util.logging.XMLFormatter

#custom logging level
com.xyz.foo.level = SEVERE

When you configure logging through the properties file, levels are typically set at the package level, since logger names follow the package and class naming convention.
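For instance, with the sample configuration above (which sets com.xyz.foo.level = SEVERE), a logger named after a class in that package picks the level up automatically. A small sketch; the class name com.xyz.foo.MyService is only a hypothetical illustration:

import java.util.logging.Logger;

public class PackageLevelDemo
{
    public static void main(String[] args)
    {
        // logger names follow the package/class naming convention, so the
        // com.xyz.foo.level = SEVERE entry above applies to this logger
        Logger log = Logger.getLogger("com.xyz.foo.MyService");

        log.info("suppressed: INFO is below the configured SEVERE level");
        log.severe("logged: SEVERE meets the configured level");
    }
}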

The restriction here is that we can specify only one log file pattern per logging properties file. So in case you need the logs to go to multiple files, perhaps based on the application, it is not possible through configuration in the properties file alone. However, you can meet this requirement by writing a specific logger class.


import java.io.IOException;
import java.util.logging.FileHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class MyLogger
{
    private static MyLogger mylogger;
    //path of the log file this logger writes to; set as appropriate
    private static final String filePath = "/home/users/mypath/myapp.log";
    private Logger logger;

    private void initialize(String name)
    {
        try {
            //MyFormatter is a custom java.util.logging.Formatter (not shown)
            Handler fileHandler = new FileHandler(filePath, true);
            fileHandler.setFormatter(new MyFormatter());
            logger = Logger.getLogger(name);
            logger.addHandler(fileHandler);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    //add the necessary synchronization blocks
    public static synchronized MyLogger getInstance(String someName)
    {
        mylogger = new MyLogger();
        mylogger.initialize(someName);
        return mylogger;
    }

    //this is a sample method; in the same way you need to provide
    //implementations for all the other levels
    public void debug(String sourceClass, String sourceMethod, String msg)
    {
        logger.logp(Level.FINEST, sourceClass, sourceMethod, msg);
    }
}

//in your application code, use it this way
MyLogger logger = MyLogger.getInstance("namethatidentifiestheloglevel");
logger.debug("classname", "methodname", "msg");

In the above code snippet, the log level is not set. The level can be specified in the properties file that is passed via the java.util.logging.config.file property. In this case, the entry would be

namethatidentifiestheloglevel.level=SEVERE

There are various levels available, the important ones being SEVERE, WARNING, INFO, CONFIG, FINE, FINER, and FINEST. Please refer to the Sun documentation, because the priority of the log level determines what content gets logged.

Saturday, November 1, 2008

Webservices in Websphere

Web services support in WebSphere is enhanced by a feature pack, as described at http://www-1.ibm.com/support/docview.wss?rs=180&uid=swg21264563. The benefits it provides, as per the extract, are:


IBM WebSphere Application Server V6.1 Feature Pack for Web Services extends the capabilities of Application Server V6.1 to enable Web Services messages to be sent asynchronously, reliably, and securely, focusing on interoperability with other vendors.

Through support for key Web Services standards, you can send messages:

  • Reliably - be confident that your message will reach its destination
  • Asynchronously - communicate reliably even if one of the parties is temporarily off-line, busy, or unobtainable
  • Securely – rest assured your messages are not vulnerable to attack
  • Interoperably – flexibility to interoperate with other vendors’ offerings

The web services can be developed with the help of common editors such as Eclipse, MyEclipse Blue, the AST offered along with the Application Server, or RAD. These are excerpts from the WebSphere sites; please check the WebSphere sites for more information.

Class Loader Hierarchy

As per the delegation contract, a class load request is first passed to the parent class loader. If the parent class loader cannot load the class, the request is handed back down to the child class loader. The process continues until it reaches the class loader that originally received the request. If the class still cannot be located on any of these classpaths, a ClassNotFoundException is thrown. The request is never passed on to the children of the class loader that originally received it.

Say we have four class loaders: ClassLoaderX, ClassLoaderY, ClassLoaderA, and ClassLoaderB. ClassLoaderX is the topmost loader, ClassLoaderY is its child, and ClassLoaderA and ClassLoaderB are both children of ClassLoaderY.

When the JVM starts, all classes accessible from /x/data can be loaded by ClassLoaderX. Control then passes to ClassLoaderY, which can load the classes present under the /y/data folder, and on to ClassLoaderA and ClassLoaderB, which can load the classes under /a/data and /b/data respectively. Classes loaded by a parent class loader will not be loaded by a child class loader again, because the child checks whether a class has already been loaded by any of its parent class loaders.

Now when the ClassLoaderB gets a request to load a class, it will delegate the request to its parent class loader i.e. ClassLoaderY. The ClassLoaderY will delegate the request to its parent classloader ClassLoaderX. Since ClassLoaderX does not have any parent class loader, it will try to load the class from its classpath i.e. /x/data. If the class is present in its classpath, it will load the class. If it cannot find the class in its classpath, the request is delegated to the child classloader i.e. ClassLoaderY. The ClassLoaderY will attempt to load the class from its classpath /y/data. If it finds the class, the class will be loaded by the ClassLoaderY. If it cannot find the class, the request is delegated to the ClassLoaderB. ClassLoaderB will attempt to load the class from its classpath. If it cannot find the class in its classpath, the request cannot be delegated to the child classloader of ClassLoaderB. Instead it will throw a ClassNotFoundException.

Consider a scenario where the following jar packaging is done. Class P has Class Q as its superclass, and Class Q refers to Class R. P is packaged in p.jar and placed in the /b/data folder, Q is packaged in q.jar and placed under /x/data, and R is packaged in r.jar and placed in /b/data. Can you guess what happens?

ClassLoaderB will load P, after ClassLoaderY and ClassLoaderX have attempted to load it and failed. Since P has Q as its superclass, ClassLoaderB then initiates loading of Q, and Q ends up being loaded by ClassLoaderX because q.jar is on ClassLoaderX's classpath. Now Q refers to R, and hence ClassLoaderX tries to load it. This results in a ClassNotFoundException, because r.jar is not present on the classpath of ClassLoaderX. As per the delegation policy, the loader does not pass the request down to its child class loaders, and hence ClassLoaderB will never be asked to load R.
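Below is a minimal sketch of that scenario using two URLClassLoader instances. The jar locations are the hypothetical /x/data and /b/data paths from the example, and the failure typically surfaces as a NoClassDefFoundError (caused by a ClassNotFoundException) at the point where Q first needs R:

import java.net.URL;
import java.net.URLClassLoader;

public class DelegationDemo
{
    public static void main(String[] args) throws Exception
    {
        // loaderX sees only q.jar; a null parent means it delegates to the bootstrap loader
        URLClassLoader loaderX = new URLClassLoader(
                new URL[] { new URL("file:/x/data/q.jar") }, null);

        // loaderB sees p.jar and r.jar and delegates to loaderX first
        URLClassLoader loaderB = new URLClassLoader(
                new URL[] { new URL("file:/b/data/p.jar"), new URL("file:/b/data/r.jar") },
                loaderX);

        // P is found by loaderB and its superclass Q by loaderX; but when Q needs R,
        // the lookup starts from Q's defining loader (loaderX), which cannot see
        // /b/data and never asks its child, so the resolution of R fails
        Class<?> p = Class.forName("P", true, loaderB);
        System.out.println(p + " loaded by " + p.getClassLoader());
    }
}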

If a class is already loaded by the parent classloader, the child classloader will not attempt to load the class again. However some application servers have written custom classloaders which override these features. Another point to note is that the same class will be loaded by ClassLoaderA and ClassLoaderB if the class is placed in their classpath. This is because the classloader always checks its parent for loaded classes but not its children or siblings.

Java has a bootstrap class loader, an extension class loader, and an application class loader by default. The bootstrap class loader maps to ClassLoaderX, the extension class loader to ClassLoaderY, and the application class loader to ClassLoaderB in the example above.
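You can see this hierarchy for yourself by walking up the parent chain of any class's loader; a small sketch (on the classic JDKs of this era the bootstrap loader is represented by null):

public class LoaderChainDemo
{
    public static void main(String[] args)
    {
        ClassLoader cl = LoaderChainDemo.class.getClassLoader();
        while (cl != null) {
            // prints the application class loader first, then the extension class loader
            System.out.println(cl);
            cl = cl.getParent();
        }
        System.out.println("null (the bootstrap class loader)");
    }
}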

A few pointers

  1. Do not place jars on the bootstrap classpath or the extension classpath just to make the application work. It will create issues that may surface later.
  2. Packaging the application is important because it impacts the class loading order too. All related classes should be packaged together, or proper documentation should be provided about the order in which the jars should be loaded.
  3. Place common utility jars and common files on a common (parent) class loader's path. However, caution should be taken if they are supposed to be used independently. For example, if log4j.properties is loaded by a parent class loader, the properties file present on a child class loader's path will never be considered, which can give annoying results.

Avoiding log4j logger statements in log files

We have all used log4j in multiple scenarios, and it serves the purpose. There are cases where we don't want the logger statements from a few classes or packages to clutter the log files. In other cases, we want a logger statement to be printed in only one log file and not in all of them. These are two different cases, and I will explain the solution for both requirements.

Requirement 1: Don't print the log statements in any of the configured log files. Consider a scenario where we have configured 4-5 log files at different levels and for different purposes. If we don't want the logger statements from the classes in a particular package, or from a particular class, to appear in any of these log files, we have to use the following syntax.

log4j.category.<package or class name>=OFF

e.g. log4j.category.com.test.welcome=OFF means that all the statements logged from the classes present in the com.test.welcome package will not be logged in any of the configured log files.
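If you prefer to do this in code rather than in the properties file, the log4j 1.x API offers the same control; a small sketch, assuming log4j is already configured elsewhere:

import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class SilencePackageDemo
{
    public static void main(String[] args)
    {
        // equivalent to log4j.category.com.test.welcome=OFF: nothing logged from
        // that package reaches any of the configured appenders
        Logger.getLogger("com.test.welcome").setLevel(Level.OFF);
    }
}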

Requirement 2: Don't print the same log statements in all of the configured log files; they should be printed in the more specific log file only. Consider a scenario like this:

log4j.category.com.pack1.pack2 = INFO, appender1

log4j.category.com.pack1 = INFO, appender2

The statements logged from classes in the package com.pack1.pack2 will be printed in the appender1 log file as well as the appender2 log file, because both categories apply to them (com.pack1 is a parent of com.pack1.pack2). Now, if we don't want the logger statements from pack2 to appear in appender2, we can use the additivity flag.

# set additivity to false to ensure that the parent appenders do not log these statements
log4j.additivity.com.pack1.pack2=false

The above line means that the logger statements from pack2 will be present only in the appender1 log file and not in the appender2 log file. With additivity turned off, events are no longer passed up to the parent category's appenders, so the statements are printed only in the most specific log file.
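The same behaviour can be reproduced programmatically with the log4j 1.x API. A small sketch; the file names appender1.log and appender2.log are chosen here only for illustration:

import org.apache.log4j.FileAppender;
import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;

public class AdditivityDemo
{
    public static void main(String[] args) throws Exception
    {
        PatternLayout layout = new PatternLayout("%d %-5p %c - %m%n");

        // appender2 is attached to the parent category com.pack1
        Logger parent = Logger.getLogger("com.pack1");
        parent.setLevel(Level.INFO);
        parent.addAppender(new FileAppender(layout, "appender2.log", true));

        // appender1 is attached to the child category com.pack1.pack2
        Logger child = Logger.getLogger("com.pack1.pack2");
        child.setLevel(Level.INFO);
        child.addAppender(new FileAppender(layout, "appender1.log", true));

        // with additivity left at its default (true), this line lands in both files
        child.info("written to appender1.log and appender2.log");

        // turn additivity off: child events no longer flow up to com.pack1's appender
        child.setAdditivity(false);
        child.info("written to appender1.log only");
    }
}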

Identifying Services

Currently the direction is towards services. This push is seen primarily in smaller and growing companies. If you look at bigger corporations that already have their business processes developed, are they changing their entire applications to turn them into services? Not really. Even in these corporations, though, there are projects happening on the services side. If a mainframe application (a few of them developed long ago) needs to make its core business process available to the rest of the company, a service helps here: the functionality that carries business meaning is what gets exposed as a service.

The important task is to find out which services should be created and at what level to expose them. A service should perform a piece of functionality that has business significance. We need to put ourselves in the context of the business scenario and then evaluate whether the candidate service performs a business function or not.

Let's take a scenario. A web dealer wants to allow users to buy the items in their shopping cart. This can be exposed as a service. The operation, however, needs many other related pieces of functionality, such as removing items from the cart, charging the credit card, and placing the order in the shipping system. For the web dealer, exposing a payment service doesn't make business sense, as it doesn't perform valid business functionality on its own. Similarly, exposing a service to update the shipping system doesn't make sense either, since the user cannot request to ship the items directly. But for a payment company, making a payment against a credit card is valid functionality in itself. The point I'm trying to make is that services need to be decided based on business functionality: food for one can be poison for the other.

In the above scenario, the web dealer exposes a service to buy items, taking the user details, credit card details, and payment details as input, performing the entire business operation, and returning the response to the client. The payment company exposes the payment service, and the web dealer's service makes finer-grained calls to it to achieve the functionality. The shipping step might be designed as a bean call, or as a service that is not exposed to external users and is used only by the web dealer's service. From the perspective of a web dealer user, it just made one call to the web dealer service rather than multiple fine-grained calls to multiple services to complete the transaction.

While identifying and designing the services, an attempt can be made to create many fine-grained services; this gives more reusability. But these fine-grained services should not be exposed directly. They should be composed into a coarse-grained service and exposed as one valid service, as sketched below. Exposing too many fine-grained services causes many issues; performance, security, and network load are a few of them.
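To make the granularity point concrete, here is a rough sketch of how the exposed and internal operations from the web dealer scenario might be shaped; all of the names below are hypothetical:

// exposed, coarse-grained: one call completes the whole purchase for the user
public interface WebDealerService
{
    OrderStatus buyItemsInCart(String userId, String cartId, CardDetails card);
}

// fine-grained; offered by the payment company and called by the Web Dealer internally
interface PaymentService
{
    String chargeCard(CardDetails card, double amount);
}

// fine-grained; internal to the Web Dealer, never exposed to end users
interface ShippingGateway
{
    void placeShippingOrder(String orderId);
}

class OrderStatus { String orderId; String state; }
class CardDetails { String number; String expiry; }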

Every attempt should be made to make the service asynchronous in nature. This means the client should not be forced to wait till the service completes its execution. After the processing is complete, the Service can provide some means to tell the client about the status of the execution.

These are my viewpoints. If you disagree, or would like to suggest something or debate, please leave your comments.