There is sometimes a problem of wanting to remove data (email addresses in this example) from within a string which are delimited.
For example, if you want to remove all non bybox email addresses from "some.name@bybox.com; simon@hicrest.net; fred.bloggs@bybox.com"
and do this for every table, without having to create functions to break apart the string first, how are you going to do it?
Here's how:
Our example table looks like this
CREATE TABLE data_export
(
data_export_id INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
email_address VARCHAR(255) NOT NULL
-- ... Other fields left out for brevity
)
Insert some test data
INSERT INTO data_export (email_address)
VALUES ('some.name@bybox.com; simon@hicrest.net; fred.bloggs@bybox.com'),
('neo@matrix.com; me@bybox.com'),
('fred@b.com; xxx@bybox.com'),
('fred@bbc.com'),
('an.other@bybox.com')
Next is the recursive CTE SQL is in several sections:
- split_by_delimeter - Breaking apart the string by delimeters.
- just_bybox - Keep the @bybox emaill addresses.
- distrinct_set - The ID's we want to update.
- STUFF - The rebuild section is the final select/update statement which containst the STUFF keyword.
;WITH split_by_delimeter
AS (
SELECT data_export_id,
email_address,
CHARINDEX(';', email_address + ';') AS n,
CAST('' AS VARCHAR(255)) AS result
FROM data_export
UNION ALL
SELECT data_export_id,
SUBSTRING(email_address,CHARINDEX(';',email_address)+1, 255),
CHARINDEX(';', email_address),
LTRIM(RTRIM(SUBSTRING(email_address, 0,
CASE WHEN CHARINDEX(';', email_address) = 0
THEN 255
ELSE CHARINDEX(';', email_address)
END)))
FROM split_by_delimeter
WHERE n > 0
),
just_bybox
AS (
SELECT data_export_id,
result
FROM split_by_delimeter
WHERE result <> ''
AND result LIKE '%@bybox%'
),
distinct_set
AS (
SELECT DISTINCT data_export_id
FROM just_bybox
)
-- For checking
SELECT data_export_id,
STUFF((
SELECT ';' + result FROM just_bybox y
WHERE y.data_export_id= x.data_export_id
FOR XML PATH('')
), 1, 1, '') AS email_address
FROM distinct_set x
/*
-- For actual update
UPDATE data_export
SET email_address = STUFF((
SELECT ';' + result FROM just_bybox y
WHERE y.data_export_id= x.data_export_id
FOR XML PATH('')
), 1, 1, '')
FROM distinct_set x
JOIN data_export
ON data_export.data_export_id = x.data_export_id
*/
This returns the following results:
data_export_id | email_address |
1 | some.name@bybox.com;fred.bloggs@bybox.com |
2 | me@bybox.com |
3 | xxx@bybox.com |
5 | an.other@bybox.com |
Perfect!