3 Ways to Convert PDF to Excel Without Losing Formatting

A step-by-step guide to converting your scanned PDFs to Excel without losing formatting.

Sep 21, 2023 | Read time 8 minutes

Introduction

A common problem when converting PDFs to Excel is that you lose some of the formatting.

It could be that the columns are out of whack, nested tables do not have the right number of columns and rows, or dates and numbers are not correct.

Out of my own frustration when converting PDFs to Excel, I have written up my findings on how to get the best results.

In this post I will compare 3 ways we can convert PDFs to Excel without losing formatting.

Formatting issues when converting PDFs to Excel

  • Columns and rows in the generated Excel file might not correctly align with the content in the original PDF file.
  • The text might not wrap correctly in cells, leading to overlapping text or text getting cut off.
  • The original fonts, styles, or colors might not be preserved, leading to differences in the appearance between the original and converted files.
  • Tables might not be recognized correctly, leading to misaligned data in the Excel cells.
  • Number and date formats might not be correctly translated during the conversion, leading to data inconsistency.
  • Active hyperlinks in the PDF might not be preserved in the Excel file.
  • Merged cells in a PDF might not translate well to merged cells in Excel, leading to a distorted layout.
  • Special characters or Unicode characters might not be displayed correctly in the converted Excel file.
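
The number and date issues in particular are easy to check with a short script after conversion. Below is a minimal sketch in Python, assuming the converted cells arrive as strings such as "1,234.56", "(45.00)" or "21/09/2023". The function names and the accepted formats are my own assumptions for illustration, not part of any tool mentioned in this post.

```python
from datetime import date, datetime
from decimal import Decimal, InvalidOperation
from typing import Optional

# Assumed date layouts; adjust these to match your own documents.
DATE_FORMATS = ("%d/%m/%Y", "%Y-%m-%d")


def parse_amount(text: str) -> Optional[Decimal]:
    """Turn strings like '1,234.56', '$20' or '(45.00)' into a Decimal."""
    cleaned = text.strip().replace(",", "").replace("$", "")
    # Accounting style: a value in brackets is negative.
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    if negative:
        cleaned = cleaned[1:-1]
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None  # not a number; flag the cell for manual review
    return -value if negative else value


def parse_date(text: str) -> Optional[date]:
    """Try each assumed layout in turn; return None if none of them fit."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None
```

Any cell where these helpers return `None` is worth a manual look, since it usually means the converter mangled the value.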

Option 1 - Using PdfDodo (⭐⭐⭐⭐⭐ / 5)

We can use PdfDodo to convert our PDFs to Excel without the loss of formatting. This would be my go-to option since the converter is versatile and works on many different PDF types - e.g. bank statements, financial reports, etc.

To start the conversion, we go through the following steps:

  • Step 1: Get an account with PdfDodo
  • Step 2: Go to the home page and click “Upload Document”

  • Step 3: After the document has uploaded - click “Start Analysis”. This will usually take a few seconds.

  • Step 4: Click on the uploaded document link to see the results.

After PdfDodo has completed analysis, it will list out all the data that it thinks is in a table format.

Now we can see our result and download the Excel or CSV file:

As we can see it correctly matches the columns and rows - even the row with multiple lines!

Once we have our Excel file, we can use formulas such as SUM, or even generate a nice pie chart of our expenses.
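
If you prefer scripting to spreadsheet formulas, the same totals can be worked out from the downloaded CSV. Here is a minimal sketch in Python using only the standard library. The column names "Category" and "Amount" and the sample rows are made up for illustration, so adjust them to whatever headers your export actually contains.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def totals_by_category(csv_file, category_col="Category", amount_col="Amount"):
    """Return (grand_total, {category: subtotal}) for a CSV file object."""
    subtotals = defaultdict(Decimal)  # missing keys start at Decimal('0')
    for row in csv.DictReader(csv_file):
        # Strip thousands separators before converting to Decimal.
        amount = Decimal(row[amount_col].replace(",", ""))
        subtotals[row[category_col]] += amount
    return sum(subtotals.values(), Decimal("0")), dict(subtotals)


# Made-up sample standing in for the downloaded file.
sample = io.StringIO(
    "Date,Category,Amount\n"
    "2023-09-01,Groceries,54.20\n"
    '2023-09-03,Rent,"1,200.00"\n'
    "2023-09-07,Groceries,23.80\n"
)
grand_total, subtotals = totals_by_category(sample)
print(grand_total)
print(subtotals)
```

With a real export you would pass an opened file instead of the sample, for example `open("statement.csv", newline="")` (a hypothetical file name). The per-category subtotals are the numbers a pie chart would be built from.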

Option 2 - Using Microsoft Word (⭐⭐ / 5)

Though not perfect, you can try using Microsoft Word as a bridge.

I would rate this option 2 stars.

Did you know that since Word 2013 you can open and edit PDFs directly in Microsoft Word? It’s pretty handy for tweaking a PDF or using it with other Office tools.

Here’s how you can do it:

  1. Fire up Microsoft Word.
  2. Hit “File” up in the top-left corner.
  3. Click on “Open”.
  4. Find where your PDF is saved.
  5. Select your PDF and hit “Open”. Just a heads up: Word will create a copy while converting it, so your original PDF won’t be messed with.
  6. Click “OK” to kick off the conversion.

After the conversion, feel free to modify it just as you would with any other Word document.

Word does an OK job of retaining the PDF’s original layout, but there might be a few elements that rearrange awkwardly.

For example, in the result below we can see that the table has been split between the pages:

Pros and Cons

Pros:

  • Users of Microsoft Word (2013 or newer) have a built-in feature for PDF conversion, so no extra software is needed.
  • Those already familiar with Word will find the conversion process uncomplicated, and it comes at no extra cost, unlike some dedicated PDF conversion utilities.
  • The conversion happens offline, which keeps your sensitive documents confidential.

Cons:

  • The conversion process can be somewhat unpredictable. Occasionally Word fails to convert content and just leaves it as an image, so you will need to type it out manually!
  • There may be inconsistencies in column arrangements, such as receiving two columns when you expected four.

Option 3 - Using Adobe Acrobat (⭐⭐⭐⭐ / 5)

Adobe Acrobat can export a PDF straight to an Excel workbook, and Excel files can easily be saved as CSV if you need that format.

Here are the steps:

  • Open the PDF file in Adobe Acrobat (not the free Reader).
  • Go to “File” > “Save As Other” > “Spreadsheet” > “Microsoft Excel Workbook.”
  • Save the file to your desired location.

Problems with using Adobe Acrobat

  • I encountered a hiccup when trying to shift data over to Excel, especially with tables that spread across several pages. Acrobat seems to treat each page as a separate table, which can throw a wrench in the works when you’re dealing with something like a long list of bank transactions.
  • If you’re mainly looking to pull data from PDFs, the cost might give you pause. It’s a bit on the higher side, which might not justify the expense if that’s all you’re using it for.
  • A bit of a gray area when it comes to data storage; I couldn’t quite figure out where the data ends up. Reaching out to support hasn’t been smooth sailing either, so I’m left with some lingering questions there.
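
For the first problem, where a table spanning several pages comes out as separate tables, the pieces can be stitched back together after export. Here is a minimal sketch in Python, assuming each page's table has already been read in as a list of rows and that the header row may repeat on every page. The function name and the sample data are made up for illustration.

```python
from typing import List

Row = List[str]


def merge_page_tables(tables: List[List[Row]]) -> List[Row]:
    """Concatenate per-page tables, keeping the header row only once."""
    merged: List[Row] = []
    header = None
    for table in tables:
        if not table:
            continue  # skip empty pages
        if header is None:
            header = table[0]  # first non-empty table supplies the header
            merged.append(header)
            body = table[1:]
        elif table[0] == header:
            body = table[1:]  # drop the repeated header
        else:
            body = table  # continuation page without a header
        merged.extend(body)
    return merged


# Made-up sample: three "pages" of one transaction list.
page1 = [["Date", "Description", "Amount"], ["2023-09-01", "Coffee", "4.50"]]
page2 = [["Date", "Description", "Amount"], ["2023-09-02", "Train", "3.20"]]
page3 = [["2023-09-03", "Books", "25.00"]]
merged = merge_page_tables([page1, page2, page3])
```

This only works when the pages share the same column layout, so it is still worth eyeballing the merged result.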

Tips and Warnings

  • Always review the converted document to ensure the formatting has been retained, as no conversion method is perfect.
  • Complex PDFs with images, tables, and other elements might not convert perfectly and might require significant manual adjustment post-conversion.
  • Be cautious while using online converters with sensitive or confidential information, as uploading such documents to a third-party server can pose security risks.
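
Part of that review can be automated. Below is a rough sketch in Python that flags rows whose number of cells differs from the header row, which is a common sign of misaligned columns. It assumes the converted data is available as CSV, and the function name is hypothetical.

```python
import csv
import io


def rows_with_wrong_width(csv_file):
    """Yield (line_number, row) for rows whose width differs from the header."""
    reader = csv.reader(csv_file)
    header = next(reader, None)
    if header is None:
        return  # empty file, nothing to check
    for row in reader:
        if len(row) != len(header):
            yield reader.line_num, row


# Made-up sample with one short row and one long row.
sample_csv = (
    "Date,Description,Amount\n"
    "2023-09-01,Coffee,4.50\n"
    "2023-09-02,Train\n"
    "2023-09-03,Books,25.00,extra\n"
)
suspicious = list(rows_with_wrong_width(io.StringIO(sample_csv)))
```

A clean file gives an empty list. Anything else points you to the lines that need a manual fix.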

Final thoughts

Overall, converting PDFs to Excel without losing formatting can be tricky. How well it goes will depend on the PDF that you are using.

Is it a scanned PDF or text based PDF?

The most effective option is to use PdfDodo. It lets you convert both scanned and text-based PDFs to Excel with little loss of formatting.

The accuracy is quite high and it supports multiple pages and password protected PDFs.

Another option is to use Microsoft Word if you already have that installed on your machine! The problem is that the conversion is not that great - you might end up with formatting issues.

Finally, we can use Adobe Acrobat. However, the issue is that it can be expensive!

👋 About the Author

G'day! I am Ken Ito, a software engineer based in Australia. I have worked as a software engineer for more than 10 years, in roles ranging from tech lead and data analyst to individual contributor. I have experience in writing code (Python, JavaScript) to extract data from PDFs for ETL and financial statement analysis for accounting purposes.

During those years, I have worked for Government departments, companies like NTT (Nippon Telegraph and Telephone), Bupa and not-for-profit organizations.

Follow along on Twitter and LinkedIn

See also

[HOW TO] Convert MayBank Statement to Excel

Sep 19, 2023

Tutorial on how to convert your MayBank Bank Statement PDFs to Excel format.

Convert ABSA Bank Statement to Excel

Sep 17, 2023

Best ways to convert your scanned PDFs to Excel format.

[RESOLVED] Export CitiBank Bank statements in Excel

Sep 15, 2023

Steps to extract your CitiBank PDF bank statement to Excel