Home » SQL & PL/SQL » SQL & PL/SQL » Compare two tables to find new/modified/removed records (Oracle DB 11g Enterprise Edition 11.2.0.4.0 - 64bit / Windows 7 x64)
icon5.gif  Compare two tables to find new/modified/removed records [message #677149] Thu, 22 August 2019 16:04 Go to next message
nickz
Messages: 39
Registered: September 2013
Location: US
Member
I am trying to find out the differences between two tables and run a procedure only on new/changed records. Details below:

STATS table is populated from a file on a daily basis. SID, LNAME, YOB are key and unique identifiers of a record.

STATS_RESULT table is a calculated table (INS_RISK, CALC_RISK) based on the information from STATS table.

There is a Procedure that calculates INS_RISK, CALC_RISK based on the data in STATS table. Currently, this procedure runs on a daily basis for all the records in STATS table and hence takes a lot of time considering millions of records in STATS table in real-life.

The ask is to trigger a calculation only on new/changed records to reduce redundancy of recalculating the same values on all records in STATS table, thereby helping system performance improvement. I want the procedure to run only on the following records:

  • where STATS table has a NEW record
  • where STATS table has a MODIFIED record (HEIGHT and WEIGHT)
  • REMOVE records from STATS_RESULT table if the same record from STATS table has been removed
How do I compare both tables to find these new, modified, deleted records?

Created two example tables:
CREATE TABLE STATS (SID NUMBER(4), FNAME VARCHAR(100), LNAME VARCHAR2(100), YOB NUMBER(4), HEIGHT NUMBER(5,2), WEIGHT NUMBER(5,2), CITY VARCHAR2(100), STATE VARCHAR2(100));
/
INSERT INTO STATS VALUES (1022,'John','Doe',1974,180,185,'Eagle','Alaska');
INSERT INTO STATS VALUES (1045,'Will','Smith',1984,190,220,'San Diego','California');
INSERT INTO STATS VALUES (1123,'Matt','Ferguson',1968,182,179,'Derby','Kansas');
INSERT INTO STATS VALUES (1205,'Bill','Farm',1977,175,165,'Sanford','Maine');
/
CREATE TABLE STATS_RESULT (SID NUMBER(4), LNAME VARCHAR2(100), YOB NUMBER(4), HEIGHT NUMBER(5,2), WEIGHT NUMBER(5,2), INS_RISK NUMBER(5,2), CALC_RISK NUMBER(5,2));
/
INSERT INTO STATS_RESULT VALUES (1022,'Doe',1974,180,185,22.5,34.52);
INSERT INTO STATS_RESULT VALUES (1045,'Smith',1984,190,220,42.87,55.23);
INSERT INTO STATS_RESULT VALUES (1123,'Ferguson',1968,182,179,65,54.02);
INSERT INTO STATS_RESULT VALUES (1205,'Farm',1977,175,165,33.54,19);
/
COMMIT;

[Updated on: Thu, 22 August 2019 21:05]

Report message to a moderator

Re: Compare two tables to find new/modified/removed records [message #677152 is a reply to message #677149] Fri, 23 August 2019 00:54 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Please post all the constraints of the tables, above all primary, unique, foreign keys.

Re: Compare two tables to find new/modified/removed records [message #677155 is a reply to message #677149] Fri, 23 August 2019 05:08 Go to previous messageGo to next message
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
Quote:
There is a Procedure that calculates INS_RISK, CALC_RISK based on the data in STATS table. Currently, this procedure runs on a daily basis for all the records in STATS table and hence takes a lot of time considering millions of records in STATS table in real-life.
What is the calculation? How are you deriving INS_RISK and CALC_RISK?

[Updated on: Fri, 23 August 2019 05:08]

Report message to a moderator

Re: Compare two tables to find new/modified/removed records [message #677157 is a reply to message #677155] Fri, 23 August 2019 09:12 Go to previous message
Bill B
Messages: 1971
Registered: December 2004
Senior Member
We also had the same problem with knowing what changed in a table with almost a half a billion rows. We came up with the following CDC (Change Data Capture) trigger. It captures all activity on the specific table without having to actually turn on database auditing, which would drag down the entire system. In the CDC_STATS table. the ACTION codes are 'I' - Insert, 'D' - Deleted, 'U' - updated. Since the action date is indexed you can quickly and easly find every change since your last run. The code follows

-- Create the CDC (Change Data Capture) table
CREATE TABLE CDC_STATS
(
  SID          NUMBER(4)                        NOT NULL,
  LNAME        VARCHAR2(100 BYTE)               NOT NULL,
  YOB          NUMBER(4)                        NOT NULL,
  ACTION_DATE  DATE                             NOT NULL,
  ACTION       VARCHAR2(1 BYTE)                 NOT NULL
);


CREATE UNIQUE INDEX CDC_STATS_PK ON CDC_STATS(SID, LNAME, YOB);
CREATE INDEX CDC_STATS_I1 ON CDC_STATS(ACTION_DATE);

ALTER TABLE CDC_STATS ADD (
  CONSTRAINT CDC_STATS_PK
  PRIMARY KEY
  (SID, LNAME, YOB)
  USING INDEX CDC_STATS_PK
  ENABLE VALIDATE);

-- Attach the auditing trigger to the table

CREATE OR REPLACE TRIGGER Cdc_stats_t1
    AFTER DELETE OR INSERT OR UPDATE
    ON Stats
    REFERENCING NEW AS New OLD AS Old
    FOR EACH ROW
DECLARE
    Lcaction   VARCHAR2 (1);
    Lnsid      NUMBER (4);
    Lnlname    VARCHAR2 (100);
    Lnyob      NUMBER (4);
BEGIN
    -- Handle the primary key being modified
    IF     UPDATING
       AND (   :old.Sid <> :new.Sid
            OR :old.Lname <> :new.Lname
            OR :old.Yob <> :new.Yob)
    THEN
        UPDATE Cdc_stats A
           SET A.Action = 'D', A.Action_date = SYSDATE
         WHERE A.Sid = :old.Sid AND A.Lname = :old.Lname AND A.Yob = :old.Yob;

        -- if the original row isn't in the CDC then insert it so it can be deleted
        IF SQL%ROWCOUNT = 0
        THEN
            INSERT INTO Cdc_stats (Sid,
                                   Lname,
                                   Yob,
                                   Action_date,
                                   Action)
                 VALUES (:old.Sid,
                         :old.Lname,
                         :old.Yob,
                         SYSDATE,
                         'D');
        END IF;

        Lcaction := 'I';
        Lnsid := :new.Sid;
        Lnlname := :new.Lname;
        Lnyob := :new.Yob;
    ELSIF INSERTING
    THEN
        Lcaction := 'I';
        Lnsid := :new.Sid;
        Lnlname := :new.Lname;
        Lnyob := :new.Yob;
    ELSIF UPDATING
    THEN
        Lcaction := 'U';
        Lnsid := :new.Sid;
        Lnlname := :new.Lname;
        Lnyob := :new.Yob;
    ELSIF DELETING
    THEN
        Lcaction := 'D';
        Lnsid := :old.Sid;
        Lnlname := :old.Lname;
        Lnyob := :old.Yob;
    END IF;

    -- store the CDC information
    UPDATE Cdc_stats A
       SET A.Action = Lcaction, A.Action_date = SYSDATE
     WHERE A.Sid = Lnsid AND A.Lname = Lnlname AND A.Yob = Lnyob;

    -- if the original row isn't in the CDC then insert
    IF SQL%ROWCOUNT = 0
    THEN
        INSERT INTO Cdc_stats (Sid,
                               Lname,
                               Yob,
                               Action_date,
                               Action)
             VALUES (Lnsid,
                     Lnlname,
                     Lnyob,
                     SYSDATE,
                     Lcaction);
    END IF;
END Cdc_stats_t1;
/

[Updated on: Fri, 23 August 2019 09:16]

Report message to a moderator

Previous Topic: Dynamic Logic
Next Topic: Help with sql using analytical functions.
Goto Forum:
  


Current Time: Fri Mar 29 01:17:13 CDT 2024