Step 1 might not be necessary; a generic dataset should be able to read Excel's XML directly if the data is just a single column.