Saturday, May 17, 2014

Code Snippet - To retrieve file information in any given folder

// Requires "using System.IO;" among the using directives at the top of the script.
public void Main()
{
    // Add your folder path
    string curpath = @"C:\XYZ\ABC";

    DirectoryInfo dirinfo = new DirectoryInfo(curpath);

    // Add your file types, e.g. *.txt, *.doc
    FileInfo[] files = dirinfo.GetFiles("*.dtsx");

    string str = string.Empty;            // comma-separated list of file names
    string str_fl_crdt = string.Empty;    // creation time
    string str_fl_accsdt = string.Empty;  // last access time
    string str_fl_updt = string.Empty;    // last write time

    foreach (FileInfo file in files)
    {
        // Append the file name, adding a separator only between entries
        str = (str.Length == 0) ? file.Name : str + ", " + file.Name;

        // These hold the values for the current file only;
        // they are overwritten on each iteration
        str_fl_crdt = file.CreationTime.ToString();
        str_fl_accsdt = file.LastAccessTime.ToString();
        str_fl_updt = file.LastWriteTime.ToString();
    }

    Dts.TaskResult = (int)ScriptResults.Success;
}

Tuesday, December 17, 2013

DWBI - A Business Perspective - Day 5


Questions:

1. Name four key issues to be considered while planning for a data warehouse.
2. Explain the difference between the top-down and bottom-up approaches for building data
    warehouses. Do you have a preference? If so, why?
3. List three advantages each for single-vendor and multi-vendor solutions.
4. What is meant by a preliminary survey of requirements? List six types of information you will
    gather during a preliminary survey.
5. How are data warehouse projects different from OLTP system projects? Describe four such
    differences.
6. List and explain any four of the development phases in the life cycle of a data warehouse project.
7. What do you consider to be a core set of team roles for a data warehouse project? Describe
    three roles from your set.
8. List any three warning signs likely to be encountered in a data warehouse project. What corrective
    actions will you need to take to resolve the potential problems indicated by these three warning
    signs?
9. Name and describe any five of the success factors in a data warehouse project.
10. What is meant by "taking a practical approach" to the management of a data warehouse project?
    Give any two reasons why you think a practical approach is likely to succeed.

Business Cases:  
  
1. As the recently assigned project manager, you are required to work with the executive sponsor to
    write a justification without detailed ROI calculations for the first data warehouse project in your
    company. Write a justification report to be included in the planning document.
2. You are the data transformation specialist for the first data warehouse project in an airlines
    company. Prepare a project task list to include all the detailed tasks needed for data extraction and
    transformation.
3. Why do you think user participation is absolutely essential for success? As a member of the
    recently formed data warehouse team in a banking business, your job is to write a report on how
    the user departments can best participate in the development. What specific responsibilities for the
    users will you include in your project?
4. As the lead architect for a data warehouse in a large domestic retail store chain, prepare a list of
    project tasks relating to designing the architecture. In which development phases will these tasks
    be performed?

Monday, December 9, 2013

DWBI - A Business Perspective - Day 4

In many companies, some sort of formal justification is needed to initiate and fund an IT project; this is called cost-justification analysis. A rough breakdown of the costs is as follows: hardware, 31%; software, including DBMS, 24%; staff and system integrators, 35%; administration, 10%.

How can you justify the total cost by balancing the risks against the benefits?

How can you calculate the ROI (Return on Investment)?

How can you make a business case?

Senior management usually demands a business case as part of an RFI (Request for Information) in order to roll out an RFP (Request for Proposal).

"A business case captures the reasoning for initiating a project or task. It is often presented in a well-structured written document, but may also sometimes come in the form of a short verbal argument or presentation. The logic of the business case is that, whenever resources such as money or effort are consumed, they should be in support of a specific business need." - Wikipedia

I want to present a few typical approaches to justifying a DWBI project; pick the one that will work best for your organization. Here are some sample approaches for preparing the business case:

1. Calculate the current technology costs to produce the applications and reports supporting strategic
    decision making. Compare this with the estimated costs for the data warehouse and find the ratio
    between the current costs and proposed costs. See if this ratio is acceptable to senior management.

2. Calculate the business value of the proposed data warehouse with the estimated dollar values for
    profits, dividends, earnings growth, revenue growth, and market share growth. Review this
    business value expressed in dollars against the data warehouse costs and come up with the
    justification.

3. Identify all the components that will be affected by the proposed data warehouse and those that
    will affect the data warehouse. Start with the cost items, one by one, including hardware purchase
    or lease, vendor software, in-house software, installation and conversion, ongoing support, and
    maintenance costs. Then put a dollar value on each of the tangible and intangible benefits
    including cost reduction, revenue enhancement, and effectiveness in the business community. Go
    further to do a cash flow analysis and calculate the ROI.
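
To make the arithmetic behind these approaches concrete, here is a minimal sketch in Python. It splits a total budget using the rough cost percentages quoted earlier in this post, then computes a simple ROI and a payback year from a stream of yearly benefits. The total cost and the yearly benefit figures are purely hypothetical, and the function names are my own; a real cash flow analysis would normally also discount future benefits, which this sketch does not do.

```python
def cost_breakdown(total_cost, shares):
    """Split a total budget into categories using percentage shares."""
    return {name: total_cost * pct / 100 for name, pct in shares.items()}


def simple_roi(total_benefit, total_cost):
    """ROI as net gain divided by cost, returned as a fraction (0.5 means 50%)."""
    return (total_benefit - total_cost) / total_cost


def payback_year(initial_cost, yearly_benefits):
    """First year in which cumulative benefits cover the cost, or None if never."""
    cumulative = 0
    for year, benefit in enumerate(yearly_benefits, start=1):
        cumulative += benefit
        if cumulative >= initial_cost:
            return year
    return None


# Rough cost shares quoted in this post (they add up to 100%).
SHARES = {
    "hardware": 31,
    "software, including DBMS": 24,
    "staff and system integrators": 35,
    "administration": 10,
}

# Hypothetical figures, chosen only to illustrate the calculation.
TOTAL_COST = 1_000_000
YEARLY_BENEFITS = [300_000, 500_000, 700_000]

breakdown = cost_breakdown(TOTAL_COST, SHARES)
roi = simple_roi(sum(YEARLY_BENEFITS), TOTAL_COST)

for name, amount in breakdown.items():
    print(f"{name}: {amount:,.0f}")
print(f"ROI over {len(YEARLY_BENEFITS)} years: {roi:.0%}")
print(f"Payback year: {payback_year(TOTAL_COST, YEARLY_BENEFITS)}")
```

The same functions can be reused for the first approach by passing the current technology costs and the proposed data warehouse costs and comparing the two totals as a ratio.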

Friday, December 6, 2013

DWBI - A Business Perspective - Day 3


Questions:

1. State any three factors that indicate the continued growth in data warehousing. Can you think of
    some examples?
2. Why do data warehouses continue to grow in size, storing huge amounts of data? Give any three
    reasons.
3. Why is it important to store multiple types of data in the data warehouse? Give examples of some
    unstructured data likely to be found in the data warehouse of a Sales and Marketing function.
4. Describe the types of charts you are likely to see in the information delivery.
5. What is SMP (Symmetric Multiprocessing) parallel processing hardware? Describe the
    configuration.
6. What is MPP (massively parallel processing) parallel processing hardware? Describe the
    configuration.
7. Explain what is meant by agent technology. How can this technology be used in a data
    warehouse?
8. Describe any one of the options available to integrate ERP with data warehousing.
9. What is CRM? How can you make your data warehouse CRM-ready?
10. What do you mean by a web-enabled data warehouse? Describe three of its functional features.

Business Cases:

1. As a senior analyst on the data warehouse project of a large retail chain, you are responsible for 
    improving data visualization of the output results. Make a list of your recommendations.
2. Explain how and why parallel processing can improve the performance of data loading and index
    creation.
3. Discuss three specific ways in which agent technology may be used to enhance the value of the 
    data warehouse in a large manufacturing company.
4. Your company is in the business of renting DVDs and video tapes. The company has recently 
    entered into e-commerce and the senior management wants to make the existing data warehouse
    Web-enabled. List and describe any three of the major tasks required for satisfying the
    management's directive.

Wednesday, December 4, 2013

DWBI - A Business Perspective - Day 2

"Master the basics and the rest follows..." Below is a questionnaire on data warehouse fundamentals.

Questions:

1. Name at least six characteristics or features of a data warehouse.
2. Why is data integration required in a data warehouse, more so there than in an operational
    application?
3. Every data structure in the data warehouse contains the time element. Why?
4. Explain data granularity and how it is applicable to the data warehouse.
5. How are top-down and bottom-up approaches for building a data warehouse different? Discuss the 
    merits and disadvantages of each approach.
6. Why do you need a separate data staging component?
7. Under data transformation, list five different functions you can think of.
8. What are the various data sources for the data warehouse?
9. Name any six different methods for information delivery.
10. What are the major types of metadata in a data warehouse? Briefly mention the purpose of each
      type.
 
Business Cases:

1. A data warehouse is subject-oriented. What would be the major critical business subjects for the  
    following companies?
    a. An international manufacturing company
    b. A local community bank
    c. An international retail chain
2. You are the data analyst on the project team building a data warehouse for an insurance company.
    List the possible data sources from which you will bring the data into your warehouse. State
    your assumptions.
3. For an airlines company, identify three operational applications that would feed into the data
    warehouse. What would be the data load and refresh cycles?
4. Prepare a table showing all the potential users and information delivery methods for a data
    warehouse supporting a large national grocery chain.

DWBI - A Business Perspective - Day 1

Hi, I would like to start a series, "DWBI - A Business Perspective", that helps you look at DWBI technology as a business person. The series will have questions and business cases to work on. I believe in self-learning rather than spoon-feeding, because it breaks through the barriers of information availability and helps us explore more.

If you still prefer to be spoon-fed, the questions and business cases are taken directly from "Data Warehousing Fundamentals for IT Professionals" by Paulraj Ponniah, a book that helped me survive as a Business Analyst. I sincerely thank Paulraj Ponniah Garu for his writings and recommend the book to everyone.

Questions:

1. What do you mean by strategic information? For a commercial bank, name five types of strategic objectives.
2. Do you agree that a typical retail store collects huge volumes of data through its operational systems? Name three types of transaction data likely to be collected by a retail store in large volumes during its daily operations.
3. Examine the opportunities that can be provided by strategic information for a medical center. Can you list five such opportunities?
4. Why were all the attempts by IT to provide strategic information failures? List three concrete reasons and explain.
5. Describe five differences between operational systems and informational systems.
6. Why are operational systems not suitable for providing strategic information? Give three specific reasons and explain.
7. Name six characteristics of the computing environment needed to provide strategic information.
8. What types of processing take place in a data warehouse? Describe.
9. A data warehouse is an environment, not a product. Discuss.
10. Data warehousing is the only viable means to resolve the information crisis and to provide strategic information. List four reasons to support this assertion and explain them.
11. The current trends in hardware/software technology make data warehousing much more feasible. Explain.

Business Cases:

1. You are the IT Director of a nationwide insurance company. Write a memo to the Executive Vice President explaining the types of opportunities that can be realized with readily available strategic information.
2. For an airlines company, how can strategic information increase the number of frequent flyers? Discuss giving specific details.
3. You are a Senior Analyst in the IT department of a company manufacturing automobile parts. The marketing VP is complaining about the poor response by IT in providing strategic information. Draft a proposal to him explaining the reasons for the problems and why a data warehouse would be the only viable solution.

Sunday, July 21, 2013

The Fuzzy Lookup Transformation

A lookup becomes fuzzy when it can match records that are similar, but not identical, to the lookup key.
The Lookup transformation returns either an exact match or nothing from the reference table, while the Fuzzy Lookup transformation uses fuzzy matching to return one or more close matches from the reference table.

Fuzzy Lookup can be used to cleanse data or to check its quality before loading it into the destination.
For example, consider a scenario where the employee data coming from a source file contains employee names that are misspelt or include bad characters. The business requirement is that the source data be cleansed before it is loaded into the destination. The source data is matched against a reference dataset (holding the correct entries), and the close or exact matches are loaded into the destination table.

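Matching between the records of the primary (source) dataset and the reference dataset is controlled by the “Similarity Threshold” setting in the Fuzzy Lookup Transformation Editor, which ranges from 0 to 1. A value closer to 1 requires a closer match between the source value and the reference value.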
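
To get a feel for how a similarity threshold behaves, here is a small Python sketch. It is not the algorithm SSIS uses internally; it substitutes the standard library's difflib.SequenceMatcher as a stand-in similarity score between 0 and 1. The function names, the reference names, and the threshold values are all made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity score between 0 (no overlap) and 1 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_lookup(key, reference, threshold=0.8):
    """Return (best_match, score); best_match is None when the score is below the threshold."""
    best, best_score = None, 0.0
    for candidate in reference:
        score = similarity(key, candidate)
        if score > best_score:
            best, best_score = candidate, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score


# Hypothetical reference data holding the correct employee names.
REFERENCE = ["John Smith", "Mary Johnson", "Robert Brown"]

# A misspelt source value is mapped to its closest reference entry.
match, score = fuzzy_lookup("Jhon Smith", REFERENCE, threshold=0.8)
print(match, round(score, 2))

# Raising the threshold makes the same lookup stricter.
strict_match, _ = fuzzy_lookup("Jhon Smith", REFERENCE, threshold=0.95)
print(strict_match)
```

A lower threshold lets more of the misspelt rows through to the destination at the risk of wrong matches, while a higher threshold rejects more rows, so the value is usually tuned against a sample of the real source data.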

               


Happy Learning!!!