Large Record Set Compare

Sumeet · August 7, 2004, 6:40am

I have a scenario where I have two schemas: employer and employees.

employer has 2,500 records.
employees has 1.5 million records.
The two are linked by an attribute, employercode, present in both schemas.

I need to find all employees belonging to each employer and store them under that employer in the employer schema.

I have to optimize the current code. It is written in Java and does this the simplest way (a nested loop, much like a bubble sort, and far too slow), i.e. it
1. Queries the employers
2. Loops through each employer
3. Picks up the employer code for the current employer
4. Queries the employees schema to find which employees match the current employer code
5. Stores the result set under the employer
6. Continues to the next employer

Thus all 1.5 million employee records are queried every time, once per employer.

This works but is obviously very slow, and exceeds the acceptable limit by a mile.

Any ideas will be appreciated.

Mark_Kuschnir · August 9, 2004, 1:00pm

Would it be feasible to do this “offline”? I.e. get in-memory copies of the relevant employer and employee data, process them locally in the program, and then update the database at the end?

For employer: need primary key + employer code
For employee: need primary key + employer code(s)

If the employer code could be represented by a Java int rather than a String, the searching/processing would be significantly faster.
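
As a rough illustration of the offline idea, here is a minimal sketch in Java that groups employees by employer code in a single pass over the in-memory data. The names EmployerGrouping, EmployeeRow and groupByEmployer are made up for this example, and it assumes the employee primary key fits in a long and the employer code in an int. How the rows are loaded from the database and written back afterwards is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EmployerGrouping {

    /** Hypothetical in-memory copy of one employee: primary key plus employer code. */
    static final class EmployeeRow {
        final long employeeId;
        final int employerCode; // assumption: the code can be represented as an int

        EmployeeRow(long employeeId, int employerCode) {
            this.employeeId = employeeId;
            this.employerCode = employerCode;
        }
    }

    /**
     * Builds employerCode -> list of employee ids in one pass over the employees,
     * instead of scanning all employees once per employer.
     */
    static Map<Integer, List<Long>> groupByEmployer(Iterable<EmployeeRow> employees) {
        Map<Integer, List<Long>> byEmployer = new HashMap<>();
        for (EmployeeRow e : employees) {
            byEmployer
                .computeIfAbsent(e.employerCode, code -> new ArrayList<>())
                .add(e.employeeId);
        }
        return byEmployer;
    }

    public static void main(String[] args) {
        // Tiny made-up data set standing in for the 1.5 million employee rows.
        List<EmployeeRow> employees = Arrays.asList(
            new EmployeeRow(1L, 10),
            new EmployeeRow(2L, 20),
            new EmployeeRow(3L, 10));

        Map<Integer, List<Long>> byEmployer = groupByEmployer(employees);

        // For each employer code, look up its employees with a single hash lookup.
        for (int employerCode : new int[] {10, 20, 30}) {
            List<Long> ids = byEmployer.getOrDefault(employerCode, new ArrayList<>());
            System.out.println(employerCode + " -> " + ids);
        }
    }
}
```

With this layout the employee data is read once, and each of the 2,500 employers then costs one map lookup rather than a query over all 1.5 million records.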
