Removing Hidden Characters in SQL Tables
You probably know what I mean.
Rows get split in the middle of a field which then shifts everything after it into the following field. Suddenly you have street names in your postcode field and created dates in your notes field.
You look back at your query and see it is displayed correctly in SSMS.
You scroll through the whole dataset and see no issues at all.
Did it become corrupted during the copy and paste?
Does Excel have an issue that needs a hotfix to be applied?
Is Excel or SQL broken?
The answer, of course, is no. There is nothing wrong with Excel or SSMS; they are doing precisely what they were designed to do.
The problem is in the data itself.
Applications don’t just store the text you can see. To display it correctly, they also store hidden characters, which the application uses to remember the formatting of that text.
There are quite a few hidden characters that can cause issues.
You can google for a complete list of them, but the three that seem to cause the most problems are CHAR(9) which is TAB, CHAR(10), which is LF or Line Feed and CHAR(13), which is your CR or Carriage Return.
So what are these mysterious characters, what do they do, and how can I remove them, so they don’t mess with my data export to Excel?
I’m glad you asked, as I recently had to do the same task.
As I have seen this issue before, it was a quick fix, and the BA (Business Analyst) was delighted with the fast solution – so here it is.
The solution to the SQL Tables problem
SELECT REPLACE(REPLACE(REPLACE(fieldname, CHAR(13), ”), CHAR(10), ”),CHAR(9),”)
This replaces the TAB, CR and LF from that field with nothing. You are removing them from your dataset and allowing you to copy and paste the results into excel for your user.
Note that you must do this for all the fields causing your data to shift areas in the dataset.