98.7% Of all customers recommend us, we're so confident about our results we publish all reviews and stats
View Live Stats View ReviewsForum home » Delegate support and help forum » Microsoft Excel Training and help » Excel data mapping/cleansing
Excel data mapping/cleansing
Resolved · High Priority · Version 2016
Amy has attended:
Excel Advanced course
Excel Advanced - Formulas & Functions course
Excel data mapping/cleansing
Hi,
I'm doing some data categorization for a client, and have a spreadsheet where there are a list of supplier names (nearly 9000), but some of the names are repeated throughout with slight variations e.g.:
A&L Dent
A&L Dent Solutions Ltd
AA Garage Service MB
AA Garage Services MH
someone has done some cleansing to get final "cleansed supplier name" column, so in the case above the adjacent column now reads:
A&L Dent
A&L Dent
AA Garage Service
AA Garage Service
However some of the cleansing has not worked effectively, e.g.
4 D Distributors
4D DISTRIBUTORS
Has been cleansed to:
4 D Distributors
4D Distributors
I'm trying to find these missed ones that haven't been cleansed effectively so that there is only one naming convention for each supplier in the "cleansed column".
I have no idea how the person mapped the supplier names in the first place as all formula's have been removed.
Is there a way for me to ensure that the cleansed supplier names are correct and are mapped properly to the original names?
Thanks
Amy
RE: Excel data mapping/cleansing
Hi Amy,
Thank you for the forum question.
Du you have a list of how the names should be?
Is this a repeating task or something you need to do one time?
Is there a pattern in the mistakes Excel can use to get it right?
If there is a pattern, it could be that some typed UK and others United Kingdom. Then you can tell Excel automatic to replace United Kingdom with UK.
If you have a list of the client's names spelled correct, you can merge the data in Power Query and if you use a anti left/right join Power Query can create a list of all misspelled client names.
You can also download the fuzzy lookup Add-in to Excel. Then you can set a threshold for accuracy. You can set a percentage. If it 90% the same map them.
Kind regards
Jens Bonde
Microsoft Office Specialist Trainer
Tel: 0207 987 3777
STL - https://www.stl-training.co.uk
98%+ recommend us
London's leader with UK wide delivery in Microsoft Office training and management training to global brands, FTSE 100, SME's and the public sector
RE: Excel data mapping/cleansing
Hi,
Thanks for your response, in response to your questions:
I don't have a list of how the names should be.
It's a one time task.
I can't identify a pattern in the mistakes. (In the example above it was the spacing between the number and letter)
I just have one list/column of names with numerous variations throughout that column and I want to end up with a new "cleansed" column/list with no variations, just one naming format for each name. So if I filter that supplier I just have one name to select from not a number of variations.
I'm not sure what was done to get the current cleansed list, but maybe I could start fresh and see how it works out? I'd appreciate any advise on how to do that please?
Many Thanks
Amy
RE: Excel data mapping/cleansing
Hi Amy,
If there is no pattern in the errors it can be very difficult to do. You may be able to do it with a combination of different methods.
If you create a column and use the Left function. If you have have the client names in column A you can in the new column write =Left(A2, 8). This will extract the first 8 characters from the client name.
A&L Dent = A&L Dent
A&L Dent Solutions Ltd = A&L Dent
AA Garage Service MB = AA Garag
AA Garage Services MH = AA Garag
But this method will not help you if you have the problem within the first 8 characters.
There is no functions in Excel which can do cleaning if there is absolutely no pattern. I am almost sure that who ever did the cleaning created a list of how he wanted the suppliers name to be and in the "cleaned" column the person used a Vlookup approximately match.
Kind regards
Jens Bonde
Microsoft Office Specialist Trainer
Tel: 0207 987 3777
STL - https://www.stl-training.co.uk
98%+ recommend us
London's leader with UK wide delivery in Microsoft Office training and management training to global brands, FTSE 100, SME's and the public sector
Tue 29 Sep 2020: Automatically marked as resolved.
Training information:
See also:
Welcome. Please choose your application (eg. Excel) and then post your question. Our Microsoft Qualified trainers will then respond within 24 hours (working days). Frequently Asked Questions
Any suggestions, questions or comments? Please post in the Improve the forum thread. |
Excel tip:Generating Random NumbersTo generate a random number in Excel use the = RAND() function. |