July 24, 2008

 

  Articles
Return to the Article Index
 

SQL Server - Data Cleansing on Large Text Columns

Recently I performed a data-cleansing task for a large publishing firm that felt very much like microsurgery. The requirements were simply stated as convert all forms of a specific type of term to uppercase in a table containing over 30,000 rows of article data Upon gathering information from the client, I requested a source of the terms that I could use for locating the suspect entries and make the conversion. That's when things got interesting and challenging. First, there was no such reference list in existence, and second, the data existed in a variety of formats. In summary, there was over 8,000 different terms that could exist in the set of rows, and each could be in a variety of formats. Furthermore, the data is contained in the SQL Sever 7.0 nText data type with the largest row containing approximately 100,000 characters.

The process for solving this puzzle involves a two-step process. Step 1 identifies the data that will be changed and it's location. Step 2 executes the process against the data itself. Building in an extra step allows the developer to verify the algorithm and generate interim reports for management and quality assurance team members to verify the process before actually affecting the data. One critical aspect that must be followed throughout the process is what I like to call "Balance". If the rules are very intricate, unrelated data might be adversely affected, and the converse is true as well. Not enforcing enough rules will invalidate the effort as well. Keep in mind that a panacea to many data cleansing efforts is non-existent and that expectation setting with the client and/or management is critical to a successful outcome. We established a very fair 80/20 rule for this process.

Step 1 involved a process of splitting the contents of each row into individual words and storing them in a table that held the word, it's position in the original data row, and the ID of that row in the original source as well. We broke on spaces and the close tag '>' for HTML because the table rows also contained these tags as well. The process was done using an INSENSITIVE CURSOR. As you can expect the table grew in size quickly. Actually the row count was approximately 850,000. Next, I created a set of SQL statements that would extract rows from the secondary table allowing me to experiment with different combinations that would yield only the data we wished to affect. The optimal statements were used to generate a report that would show the item being converted, it's article, and how it would appear after modification using the UPPER function.

Step 2 also involved the use of an INSENSITIVE CURSOR to loop through the reference table that was created from Step 1 and make all of the updates. This was easy because my secondary table holds the value; it's position, and source id. I had to use the UPDATETEXT command as we were affecting nText data. Additionally, the Select Into / Bulk Copy option was used as well to disable logging for the use of the UPDATETEXT statements.

Lastly, I built a small ASP page with an OLEDB connection into the database that allowed the QC staff to check the contents of the articles against the original data. The OLEDB provider allows me to avoid the problems encountered with large text fields and ADO result sets that occurred using the SQL SERVER Driver. The script was run against the live data during the overnight as not to impact the server performance. Execution time averaged 1.5 hours.


eBallot - The Future of Elections

Firstly, this is not an attempt for me to state my political affiliation. This is merely an expansion of a discussion that several colleagues and me had regarding the potential use for electronic polling devices in future elections. This discussion surfaced as a result of the recent events surrounding the presidential election with respect to the clarity of the ballot format, the process of paper balloting, and the availability of real-time results information.

The ability to record and quantify the results would be quite easy considering the use of the Internet and the availability of inexpensive high performance hardware. The core issues for tackling the problem pertain to the user interface of an electronic polling booth and the method for securely capturing and moving the data.

The interface itself would need to be a large non-glare display that would present only several key ballot items on the display at one time. These items would contain text and images to cater to those who are unable to read or those who are visually impaired such as the elderly. An audio interface with an earpiece would be available for blind voters. Ballot selections would be made by the user tapping an entry on the screen with a stylus that is connected to the booth, much like a pen on a string or chain. The user would be prompted to select specific items on the screen prior to starting to insure that they are aware of the operation of the e-Ballot before starting. The mechanism for scrolling on the ballot would be a physical slider bar on the right hand side of the display that would smoothly move the display up or down depending upon the users actions. Essentially user experience is enhanced by the use of both graphical content and manual navigational devices.

The idea of introducing an electronic method for balloting is not new. The ability to securely capture and record the voters selections is now possible with the usage of encryption technology that is more effective than that used by online e-commerce sites. The use of the stylus introduces a type of security commonly used with Smart Card technology. Security can be defined as something you know, who you are, and something you have. Most uses on the web involve the first two; yet as biometrics and other technologies mature, then the third will become more commonplace. Smart Card technology bridges that gap by requiring the user to possess a physical device prior to and during all system interaction. Encryption technology elevated to 128-bit security using a series of complex mathematical functions and algorithms can guarantee a secure transaction. The stylus presented above would act as the Smart Card device. This would prevent voting tampering from occurring beyond the polling locations.

Data captured from the polling location would need to be replicated thousands of times in distributed locations to insure redundancy and re-correlated using a CRC-like algorithm for verification. A real-time system such as this would probably require some enhancements to the process of verifying identification at a polling location and the process for submitting the ballot as well. There are implications to having such real-time information which could alter voter behavior.


SPECIAL ISSUE - Creating a Web Based Management Tool for Microsoft SQL Server Jobs

This article is part of the "Special Issue" series where you are provided with a high level of detail about a given area. In this issue, you will learn how to develop both the front and back end components for a web based tool that manages SQL Server jobs. The tool called "JobWatch" allows the user to view the last execution status of scheduled jobs and force a manual execution if desired. The audience level for this topic is rated Intermediate, due to the required understanding of ASP, SQL Server implementation and administration.

"JobWatch" is a single ASP page that makes use of the sp_help_job and sp_start_job system stored procedures for monitoring or starting jobs respectively. The page itself contains a table filled with the job status information and a green arrow "play button" that allows the user to start a job. The center of the page contains a "Refresh" button that updates the last status and execution date/time stamp column for the jobs. You can explore using an automated refresh method with JavaScript as a way to enhance the tool.

 

We will start the lesson by creating the stored procedure in Microsoft SQL Server 7.0 that will trigger an existing job. I'm assuming that you have an existing job that you wish to place under the control of the "JobWatch" tool. The stored procedure should be created as follows:


CREATE proc sp_GetData
as
declare @returncode int
 
exec @returncode = msdb..sp_start_job @job_name = 'dtsGetData'
 
return(@returncode)

The procedure is rather simple and makes use of the Master Database procedure for executing jobs. We will not make any use of the Return Code for this exercise. You should also remember the SQL account that is required to connect and run the above stored procedure. We will assume it is user "jobwatch" and password "cooltool". Test the stored procedure in the SQL Query Analyzer by executing the following line and then monitoring the output in the Job section of the SQL Server Enterprise Manager:

exec sp_GetData

The last thing you want to do on the SQL Server side is gain an understanding of the parameters and results for the system stored procedure sp_help_job.

Next, we will create the ASP page that glues the functionality together and creates our useful tool. You will use the Dictionary object to store the description and name of the jobs you wish to monitor and control. The ASP script will provide the following functionality:

If the page is refreshed as the result of a Job Execution request (user presses the green arrow "Play Button"), execute the stored procedure for the job we created above. Next all of the necessary Dictionary Objects are loaded with the Job information and equivalent textual representation for the job status information that will be returned from the sp_help_job stored procedure. Notice that the status information is loaded into the Dictionary using color and font formatting. This simply adds flavor to the display for different job outcomes. Lastly, the stored procedure sp_help_job is executed and the resultset is used to build a table with all of the corresponding last status and execution information. This implementation was simplified by naming the stored procedures and associated jobs with a naming convention that allowed the top part of the program to determine which stored procedure to execute based upon the job name.

Download the ASP Listing for JobWatch

The program does make use of the adovbs.inc include file for the use of ADO constants. You might want to explore the use of a single stored procedure that executes any of the jobs requested by accepting a parameter.


Database Transformation Service (DTS) Jobs and Permissions

This article is a follow-up to one that appeared last month titled "The Power of Microsoft SQL Server 7.0 Data Transformation Services." The previous article described the use of Microsoft SQL Server 7.0's Database Transformation Services (DTS) for the importing and exporting of data from a variety of sources. Using DTS in such a manner is quite intuitive when following the Wizard or Package Designer Interface, which allows the user to interactively design and execute their packages. The difficulty begins when the user wishes to run these DTS packages in an unattended manner on a schedule with data being imported or exported to some location on the network other than a SQL Server database.

DTS packages use the current users Windows NT and/or SQL permissions when execution is done interactively from within the Microsoft SQL Server Enterprise Manager. The Windows NT permissions allow the package to gather data from a network location such as a folder and the SQL credentials allow this data to make it's way into the designated tables. When attempting to start the job that is associated with such a package, the SQL Agent's set of permissions is used. If the SQL Agent does not have permissions to go to a network folder and read it's source, the job will surely fail. Furthermore, when using a Microsoft Access database as the source, be sure that the SQL Agent has WRITE permissions as well as READ permissions, because opening an MDB file requires the creation of an LDB (Lock) file. The importance of this topic is realized when you have a system that users interact with during normal business hours and whose data is imported at night on a scheduled basis.


Getting the Right People for the Job

When building a team from existing resources, or looking to hire from outside the firm, always remember that their technical and personal skills will have a direct effect on the deliverable. Having an individual with an excellent technical skill-set is paramount; yet their ability to function diplomatically, confidently, and especially enthusiastically in a team approach is equally important.

Many companies are shifting to the team room approach commonly referred to as "Solution Labs" or "War Rooms". These are environments where developers work in close quarters with each other and emphasis is placed on open communication for resolving technical issues facing the team. The candidate must be able to articulate their own thoughts and express them in a manner conducive to the group environment with confidence and not be intimidated in such a setting.

Understanding the context of the candidate is helpful in assessing their fit for the job. Work with the candidate to insure that any apprehension on their part is not a lack of knowledge, but maybe just nervousness. Stay on track with questioning as it pertains to the role they must fill. Provide the opportunity for the candidate to describe a challenge they encountered and the method they chose for resolution. Often an energetic person will gain more ground on a project then one who is more technically savvy, but less motivated. One good technique for interviewing senior level resources is to provide them with a question in an area unknown to them. This type of questioning will usually not yield the perfect answer, but the best candidates will diplomatically present ideas and ask questions to gain an understanding of the topic.

Does the candidate really love the type of work they've done in the past and are applying for in your firm? Are they motivated to seek answers for the most efficient and effective solution to a problem? A good candidate will often express an interest in the technology beyond what he or she does in the normal job setting. This might include reading journals, teaching, or managing the content of a web site.

Lastly, ask yourself the following question when ready to make a decision: "Would I hire this person to work on this project if this were my own business"


The Power of Microsoft SQL Server 7.0 Data Transformation Services

The ability to import and export data to and from Microsoft SQL Server 7.0 has been simplified with the use of Data Transformation Services (DTS). This tool replaces the Bulk Copy Program (BCP) that was used in a similar capacity in versions of the Microsoft SQL Server product prior to the version 7.0 release. BCP required the user to manipulate a text file that defined the order, type, size, and rules for importing data. The utility worked well; yet was less intuitive than DTS. The user can now define complex import and export tasks called "Packages" by using the DTS Wizard, or the intuitive graphical designer tool. In either case, the user can define the source, destination, and a set of rules for moving the data. These rules can be specified using SQL queries and VBScript. Once the Package is defined, DTS will create a JOB for the user that can be scheduled and enabled when desired. The truly exciting part of this utility is that it is extremely fast, supports a variety of sources and destinations, and is easily customizable.

Recently I imported 4 large tables in under 5 seconds from Microsoft Access into a Microsoft SQL Server database. The network at this location was slow and the source and destination objects where in two separate facilities in Boston. DTS can be used for some really powerful commercial applications. It is simple to define a Package that can run on a scheduled basis that extracts mission critical SQL Server data and loads it into a Microsoft Excel Spread Sheet for use by management. Another practical application involves importing large data feeds into SQL Server, which is a common practice in the Insurance and Health Care industry.


Creating Link Tables in Microsoft Access without using a DSN

This is a quick tip for Microsoft Access developers who want to programmatically link to tables in a Microsoft SQL Server database. The developer gains control over the connection to the database and tables, and is able to shield the user from possible problems in a graceful manner. In addition, the need to maintain a DSN on the users machine or a shared location is eliminated. The code sample provided assumes that the table SourceCompany has not been linked. You could include logic to detect the existence of the LinkedCompany table in the tabledef object, or use an on-error statement and block to detect the failure when the table already exists in the collection.

Dim dbcurr As database
Dim tbd As TableDef
 
Set dbcurr = CurrentDb()
Set tbd = dbcurr.CreateTableDef("LinkedCompany")
 
With tbd
.Connect = "ODBC;DRIVER=SQL Server;SERVER=server;" & _
   "UID=username;PWD=password;DATABASE=database"
.SourceTableName = "SourceCompany"
End With
 
dbcurr.TableDefs.Append tbd

The code creates a tabledef and current database object. Next the desired link table name is specified in the CreateTableDef statement. This is a placeholder at this point because we have not established the link yet. The With clause defines our ODBC connection and the name of the external table we wish to link. Lastly, the new tabledef is appended to the tabledef collection of the current database. The link will persist when the Microsoft Access database is closed. Remember to press the F5 (Refresh) key in the Tables window in Microsoft Access for the new link table to appear.


SPECIAL ISSUE - Effectively Handling Control-Breaks

The term Control-Break normally refers to the handling of data that comes back in a format where the developer must implement logic to determine when to start a new section heading or produce sub totals. Usually this is handled in a loop where values are stored in variables called "Current" and "Previous" and they are continually checked. When an inequality is detected, the program performs another action, sets the variables, and then continues execution. The code for these situations can become a bit confusing especially if the hierarchy (Control values) are several levels deep. The following diagram represents the hierarchy that we want to manipulate: (Note: The colors have been included to clarify the explanation.)

 
Parent Level Category
Sub Level Category
Child
Child
Child
Child
Sub Level Category
Child
Child
Child
Parent Level Category
Sub Level Category
Child
Sub Level Category
Child
...
...

Note that the number of Child rows under each Sub Level Category varies and that in this case, any number of Sub Level Categories and Parent Level Categories can exist as well.

A solution to this scenario involves an organized approach to returning the data to the program with embedded codes that instruct the program how to handle each row. The program logic is simplified and the probability of an error is minimized. The following represents the data that would be returned from a SQL statement using the scenario from above:

 

SELECT d.ParentCategoryLevel,
d.SubCategoryLevel,
d.Child
FROM tblData d
ORDER BY d.parentCategoryLevel, d.SubCategoryLevel, d.Child

 

  Parent Level Category Sub Level Category Child
  Parent Level Category Sub Level Category Child
  Parent Level Category Sub Level Category Child
  Parent Level Category Sub Level Category Child
  Parent Level Category Sub Level Category Child
  Parent Level Category Sub Level Category Child
  Parent Level Category Sub Level Category Child
  Parent Level Category Sub Level Category Child
  Parent Level Category Sub Level Category Child

The data is returned in this manner for performance reasons. Executing sub-queries (stored procedures) at each level would be inefficient. Now imagine the data is returned with the proper integer codes:

 
  Code 1 Parent Level Cat. NULL NULL
  Code 2 NULL Sub Level Cat. NULL
  Code 3 NULL NULL Child
  Code 3 NULL NULL Child
  Code 3 NULL NULL Child
  Code 3 NULL NULL Child
  Code 2 NULL Sub Level Cat. NULL
  Code 3 NULL NULL Child
  Code 3 NULL NULL Child
  Code 3 NULL NULL Child
  Code 4 NULL NULL NULL
  Code 1 Parent Level Cat. NULL NULL
  Code 2 NULL Sub Level Cat. NULL
  Code 3 NULL NULL Child
  Code 2 NULL Sub Level Cat. NULL
  Code 3 NULL NULL Child
  Code 4 NULL NULL Child

Notice that the Code 4 is used to indicate the end of a Parent Level Category section. This terminating code is used by the program to indicate when an operation should be executed, such as closing a file or starting a new heading. The pseudo-code for this scenario would appear as follows ("rs" is the result set containing the returned data):

DO WHILE NOT rs.eof
SELECT CASE rs("Code")
CASE 1
This is a Parent Level Category. Open a file and make the text large and bold.
CASE 2
This is a Sub Level Category. Indent the text and make it bold.
CASE 3
This is a Child row. Indent and use standard text.
CASE 4
This is a terminating row. Close the file.
END SELECT
rs.MoveNext
LOOP

The SQL (Structured Query Language) statements are the real key to this solution. By using the UNION and ORDER BY statements, the data can be pieced together in the format we desire. The following SQL statements achieve the objective:

SELECT ParCL = d.ParentCategoryLevel,
SubCL = NULL,
Child = NULL,
Code = 1
FROM tblData d
UNION
SELECT ParCL = d.ParentCategoryLevel,
SubCL = d.SubCategorylevel,
Child = NULL,
Code = 2
FROM tblData d
UNION
SELECT ParCL = d.ParentCategoryLevel,
SubCL = d.SubCategoryLevel,
Child = d.ChildDataItem,
Code = 3
FROM tblData d
UNION
SELECT ParCL = d.ParentCategorylevel,
SubCL = NULL,
Child = NULL,
Code = 4
FROM tblData d
ORDER BY ParCL, Code, SubCL, Child

Using the ORDER BY statement, we can take all of the rows and order them in the desired format. This solution will return more rows which will impact performance depending upon the size and number of data rows being scanned. I used this solution to early-render HTML pages where the concern for performance was important; yet the extra few seconds was acceptable. The "payback" with this approach comes in the form of cleaner and more manageable code. The strategic use of indexes and other tuning enhancements can help performance as well.


Web-based Surveys – An effective and efficient method for field data collection.

Over the years I've worked on projects that involved some form of data collection and reporting. These efforts were usually industry specific and involved collecting information from various companies on compliance with government regulations and adherence to recommended practices. After the data was collected from the field, reports were generated in a variety of summary formats. The reporting process was usually a set of Microsoft Access queries and reports with embedded formulas. Data collection is the most difficult process to achieve accurately and in a timely manner due to the dependency on your target respondents.

Collecting data from the field usually involves several steps. Identifying the data-points required from the field, method for delivering and receiving the responses, and some form of follow-up with the participants. Identifying required information and prodding the participants to complete the surveys will not get any easier; yet the method of delivery improves when utilizing a web-based approach. Consider the following scenario:

I'm a regulatory agency tasked to oversee 100 national companies who engage in the petro-chemical industry. Each is required to document their compliance with various government regulations regarding pollution abatement and waste water management. The information on these companies is stored in either a Microsoft Access or SQL Server database. I have a set of HTML/ASP pages created that contain all of the required survey questions I require along with clear textual help and a 1-800 number to assist the respondents. The start of the survey welcomes the user and provides a friendly summary of the survey process. The last page dumps all of the form variables into the Microsoft Access or SQL Server database. Once I'm content with my survey, I then need to prepare my notification that will be sent to the participants. I go through each row of the Company table and generate an email via CDONTS or other third-party mail object; yet I don't just include a simple link in the email. I want to personalize the interaction, so I append their Company ID to the end of the link. When they click on the link in their email, they are brought to the survey and a nice welcome message is presented including their own name. Of course, each page in the survey might validate the users responses (then ASP's are used - not html) before they proceed to the next page. Once the survey is completed, their responses are stored in the data repository. At scheduled intervals it is possible to query the database and identify anyone who hasn't completed the survey, and send them a friendly reminder. One additional aspect to consider when using this web-based approach. Some companies use older browsers which might not have features enabled such as JavaScript and/or cookies. This could affect the validation logic used on forms and should be factored into the design.

In conclusion, once the data is collected, it is possible to use Microsoft Access or Excel to produce reports. All of these tools can be linked together with little effort.


Keeping your Database behind the Firewall - The Portability of XML

This is a further exploration on the topic of early-rendering web pages to enhance site performance (see "Early Rendering - Portability and Performance"). In this scenario, the database is kept behind the firewall, and all content for the site is published according to schedule or when conditions warrant (i.e. the MSN site). This publish process has been explained in previous articles of the IT News; yet what about capturing the users data? All static (early-rendered) pages are HTML pages; yet they've been generated with HIDDEN form fields containing all or some data displayed on the page itself. The user can select a link on the form to modify the data. This link sends the user to an ASP page that uses the Request("form variable") syntax to retrieve the data from the HTML page. The user can modify the data and then press submit when complete. The data does not get stored in the database; yet is loaded into an XML file using the Microsoft Document Object Model (DOM). The file is then transferred via CDONTS or other third-party mail object to a location inside the firewall. This file can be parsed again using the DOM and presented to an administrator for approval or automatically imported into the data repository. The modifications will be visible to the user community when the next publish cycle occurs. An email can be sent to the user as well to notify them of the posting of their information. This is an excellent strategy for a company-directory/product type of site with high volume.


User Interface Ergonomics - Focusing on the User

I've always had a strong interest in the usability of systems. Not just computer-based systems; but any process that accomplishes some task. Years ago, I had an excellent opportunity to visit Toyota's Kamigo Engine plant in Nagoya, Japan. The level of automation in the facility was extremely high with all types of numerically controlled machines placed strategically along the assembly line. Despite this operations management miracle, their was a definite concern for the human element. Any type of difficulty faced by the worker needed to be rectified and resolved as not to affect the finely tuned cycle times. Consideration for the users physical and emotional well-being was continually addressed through a feed-back and control process involving workers and a group leader. Many IT professionals spend a bulk of their time concerned with rapid delivery of information to the user's desktop; however, often neglect the importance of usability. Users who are required to fight their way through multiple forms to complete a transaction, or assist a customer, will become distressed. This can lead to poor productivity, lower levels of customer service, and a loss of revenue. It's essential to step back and look at a process. "What is the intent of the process and how will the users expect it to function?" Get prototypes in front of the users and listen to their feedback on all details including items such as colors and fonts. Many small points, if not addressed, can become large headaches later on. These considerations apply to both web-based and traditional multi-tier solutions.


Storing HTML in the Database - The Advantages and Disadvantages

This article is similar to one that appeared in the March 2000 issue titled "Early Rendering - Portability and Performance". Building upon the strategy of early rendering for speed and platform independence, the publishing duration must also be considered. A database that receives frequent updates might require all HTML pages to be pushed out to the site several times each day. Certain layout and detail information on each of these HTML pages can be pre-processed in the database and stored in a single varchar field as a string containing both HTML tags and column data for the record. This will dramatically improve performance when looping through the ADO Result Set in VBScript and writing out each detail page. Building the string in VBScript, handling each ADO column independently, will degrade performance. Using a single ADO column with the HTML, along with includes for the Header and Footer will super-charge your solution. I've recently reduced the rendering time of 1600 pages from 45 minutes to 22 seconds using similar methods. These 1600 pages included all sorts of Index, Detail, and Category lookup pages as well.

The improved performance does come at a price. Storing HTML in the database is difficult to maintain. Getting to the HTML can be inconvenient, and even less friendly to edit in a tool that doesn't illuminate HTML and ASP tags in an intuitive manner. These issues can be minimized by placing a minimal amount of formatting information in the varchar field and limiting yourself to just the concatenation of the column data. Other elements such as headers and footers, which are not dynamic, should be handled as include files.


The Power of Microsoft Access - Reducing Risk

Recently I was asked to develop a simplified administration tool for a Microsoft SQL Server database. The limited number of in-house users would want to have secured access to the database and make routine modifications to it's content throughout the business day. To accommodate this request we opted for Microsoft Access as the tool of choice due to it's built in functionality for handling the users interaction with each record and tabular/datasheet display characteristics. Creating a simple Access Form and creating Link tables to the SQL database was a breeze and completed in just several hours. Taking actions like this, especially with proven tools and techniques, will minimize your risk and cost on any IT effort.


The Future of Wireless Web - A Prediction

As wireless web becomes more popular and affordable, we will see a gradual migration towards computing devices that no longer require direct connectivity via wires and hubs. Users will interact with the web and corporate environments in virtual domains. A corporation will contain a virtual domain for it's users to access secured servers, email, and printers within there facility, yet another virtual domain encompassing their business district might also exist providing traffic and local event information. Imagine the marketing potential for advertisers placing highly targeted content on your laptop as you drive through a virtual domain (i.e. Times Square, New York). Users could choose to subscribe to these virtual domains as they drive through a foreign area seeking lodging and dining information.


Earning Millions without a Computer - A True Business Case Study

A Boston area dental practice earns millions in revenue each year without relying on any IT infrastructure. The practice contains 7 patient treatment rooms, 15 employees, at least 5000 annual bookings, and specializes in general, pediatric (child), and endodontic services. The office utilizes modern x-ray machines in every room, constant interaction with an outside lab for various dental appliances such as dentures, and participates in many major insurance provider programs. The practice has been serving the community for over 50 years, and has really focused on high-quality personalized care, which leads to numerous repeat customers seeking a variety of treatment plans from oral hygiene to emergency surgery.

Essentially the practice generates a level of customer demand that they can efficiently match with quality service. The added cost of implementing an automated office management and marketing system, as well as the increased customer demand that such a system might generate, might actually harm the balance they've established. The dental office is not an IT shop. The goal of the business is to generate revenue, provide some form of satisfaction to the dental practitioners who enjoy practicing their skills, and to serve the community with excellent care. They are doing this quite well and agree that deferring an IT implementation is probably the best decision at this time. This conclusion is often overlooked and leads to more than just toothaches after time and money has been spent.


Populate JavaScript DHTML with Database Content

Recently I was tasked with the requirement to place a collapsible tree on the main page of a very prominent web site. I knew this involved dynamic HTML and found a JavaScript sample and code that would be an excellent fit. Later my client informed me that the content of the tree would change quite frequently and that they'd really like a data-driven solution for this form element. This presented an interesting problem because the tree was all JavaScript (client-side) and any database calls were reserved for VBScript (server-side). After careful consideration I arrived at a rather simple; yet flexible solution. I'd build a string of JavaScript code while looping through the ADO Result Set in my VBScript, and set a hidden form element to this value. Upon the execution of the "OnLoad" function in the <BODY> tag, a call would be made to a JavaScript Init() function that would evaluate the value of the hidden form element containing the string of JavaScript commands. There were some slight differences in the order of interpretation between Microsoft's Internet Explorer and Netscape's Navigator browsers. The server-side building of the string was built before the <BODY> tag to satisfy Netscape's Navigator browser. This was indeed a fun challenge.


Data Cleansing Tips & Tricks

Data cleansing efforts can be resource intensive and should contain a checkpoint at some specified point where a "go" or "no go" decision is considered. I've had the opportunity to execute both a manual and automated process at several clients. An automated method usually entails the generation of Match Codes containing standardized names and portions of the address and postal code. These codes are then grouped, compared, and the associated records are collapsed into a specific parent based upon equality and the business unit's rules. A manual process involves a human operator going through the data and modifying, deleting, or merging records, also based upon the business unit's rules. Both approaches usually entail some form of standardization and correction of data such as names containing words like Corporation, Corp., Inc., Incorporated, and Company. States, regions, and countries are another area frequently requiring correction. Some processes involve inspecting area and country codes to resolve missing states and/or regions.  Other creative and accurate methods can be employed to correct a variety of missing values.

I'd like to explain an interesting approach to the manual method described above. Microsoft SQL 7.0 Enterprise Manager now contains a grid-like display of a table much like Microsoft Access. Before the process begins, I insert two columns into a cleansing candidate table. The first column contains a sequence number starting at 1 and going to the record count. This value is assigned once the records are sorted by a common column such as Company Name or Product Number. The user will look for duplicates and usually the duplicates will fall together after a sort is performed. The second column is called a status where the user can enter codes. The codes I use are "D" for delete, "E" for export, "MC,xxx" for merge child with parent at id xxx where xxx is the id from my first column defined above. This allows the operator to sort on any column; yet preserve their codes and dependencies. After the operator is finished, I execute SQL scripts that look at the codes and execute accordingly. Each code is preceded with a character such as 'x' to identify it as being completed and to indicate to the script that it should be ignored on subsequent executions. Keeping the rows until the end allows me to re-visit and correct any anomalies if they arise. Additional codes can be defined and built into the SQL scripts depending upon the requirements of the business unit's rules and requirements.  Prior to delivery, I remove the additional columns.


Microsoft Remote Scripting - An ASP Miracle

Remote Scripting technology allows the developer to deliver ASP web solutions that behave similar to their traditional client/server counterparts.  A great use of this technology involves a dependant listbox scenario.  The user selects a country from one dropdown, and the regions or states within that country appear in another dropdown without refreshing the page.  This technology incorporates a server-proxy architecture which is only required on the server and is freely available.  Remote Scripting currently is supported by both Internet Explorer and Netscape Navigator; yet the developer will encounter differences in the method of invocation.


Early Rendering - Portability and Performance

The reliance on database technology to support the dynamic content of web solutions can lead to a degradation in performance as the number of hits to the site increases.  One leading publishing company in New England employs a technique called Early Rendering to meet this hurdle head-on.  Developers create ASP components that generate HTML pages during the publishing phase using Microsoft's Interpress product.  All database calls are completed during the publishing step, which generates all the required pages.  Surely this can lead to an increased number of pages for e-commerce solutions; yet the performance improvements are remarkable when considering users lose interest after just a few seconds of waiting.  Using this technique to generate only HTML output can lead to platform independent solutions.  Currently the New England company deploys on the NT platform.


Traditional Client/Server Solutions - A Deployment Nightmare

Creating applications in tools such as Visual Basic traditionally required a business entity to rely on the services of their internal IT group to deploy these applications and any subsequent releases to their user community.  Imagine the logistical nightmare involved when a portion of the staff consists of field sales representatives that rarely see the corporate home office.  Web based solutions can be deployed on a secure Intranet and made instantly available to all users without the installation hassle.  Remote users can connect to these same systems using SSL or other PK encryption technology. Another benefit involves the notion of a thin-client architecture.  End-user computers would only require an Internet browser and the corporate standard issue of productivity software such as word processors and spread sheet applications.


 Back to Top


Copyright © 2000-2005 Margolis Consulting Group, LLC. All Rights Reserved.
Terms & Agreement under which this information is provided to you.
Read the privacy guidelines.