If you use a computer, you probably work with documents and files often, though you may not pay much attention to their formats. Document file formats play a big part in determining how a computer opens a document and how that document will function.
Your computer probably has default settings that let you open, edit, and save documents without worrying about what format they are in. This can go on until you email a document and the recipient tells you they cannot view the file because of its format. That is when knowledge of file formats and their extensions becomes critical.
Document: You may be used to thinking of a document as a written copy of information. When it comes to computers, we use the word “document” to describe any file created using software. This could be text, video, image, or audio.
File: A file is where a document or any other data is saved on a computer. When you create a document, you save it as a file and give it a file name. So a document can be referred to as a file once you save it.
Format: This is the way data in a file is saved and encoded. Think of it as the language of the file or document. For example, a Word document would be saved in DOCX format. That is the language the document communicates in. Each kind of data needs to be saved in a format suited to it.
Extension: An extension identifies the format a file is saved in. For example, if you name a Word document “tom” when you save it, your computer will automatically add the extension docx, so the file is saved as “tom.docx”. That helps the user know the format of the document. Please understand, however, that renaming the file to tom.pdf will not change the format. Changing the file extension doesn’t convert a file into a new format. We shall get into that later.
Throughout this guide we will be referring to the above terms, so it is good to take them in before proceeding to the rest of the guide.
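The point made under Extension, that renaming a file does not convert it, can be illustrated with a short sketch. Many formats begin with a fixed sequence of bytes, so a program can look at the content instead of trusting the name. The table below is a small illustrative subset chosen for this example, and the function name `sniff_format` is hypothetical.

```python
# Leading bytes ("magic numbers") for a few formats mentioned in this guide.
# This table is a small illustrative subset, not a complete registry.
SIGNATURES = {
    b"%PDF-": "PDF",
    b"PK\x03\x04": "ZIP container (DOCX, XLSX, PPTX, JAR, ...)",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"\xff\xd8\xff": "JPEG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
}


def sniff_format(data: bytes) -> str:
    """Guess a format from the first bytes of a file's content."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return "unknown"


# A DOCX file starts with the ZIP signature. Renaming "tom.docx" to "tom.pdf"
# leaves these bytes untouched, so the content is still not a PDF.
renamed_docx_bytes = b"PK\x03\x04" + b"\x00" * 16
real_pdf_bytes = b"%PDF-1.7\n"

print(sniff_format(renamed_docx_bytes))
print(sniff_format(real_pdf_bytes))
```

To check a real file, the same function could be given the first few bytes read from disk in binary mode.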
Now let us look at some of the common file types:
There are numerous types of files, but some get used more often than others. You have probably come across many of the file types in this list, but you may also get to know new ones.
Text Files: Typing is probably the most common thing people do on a computer, and what they type is usually saved as a text file. A Word document is an example of a text file once you save it. There are many formats for text files, which we shall see below.
Image Files: There are various image file types you may have interacted with. Some are larger than others. Some image formats suit particular uses, such as websites, while others are compressed for easier storage.
Audio Files: If you have saved music on your computer or another music device, you have saved an audio file such as MP3 or WAV. These are the common audio file formats.
Spreadsheet: If you work with figures a lot, then these are files you would be familiar with. Microsoft Excel files are the most common file formats for spreadsheets.
Video File: Videos are saved in many formats, and you may encounter several video file types as you download videos online. The common types include Windows Media Video (WMV), MP4, Flash Video (FLV), and MPEG.
As we said, there are several file types. The above mentioned are the most common and they can give you a fair idea of what they are. For the different file types to function as they should, they depend on the document formatting. Below are some of the common types of document formatting.
From the definitions above for file format and extensions, you must have realized that the two are related. While we mention the common formats, we will identify them by their extensions, so this may serve as a list of file extensions as well.
.Doc: This was the earlier format Microsoft used for text-based files. You can still use it, although there is a newer option that is more commonly used. When you save a Word document, you can choose the DOC format, but certain programs may fail to open the document.
.Docx: With more than 1 billion people using Microsoft Word, Docx is the most shared Word file format. This is the extension for the current Microsoft Word document format. It replaced .doc as the default and can be read by many different programs. A Docx file is a compressed archive that holds several files, including the document text, stylesheets, and other information. Primarily it is an XML-based format. You cannot read a Docx file directly in a text editor, although you can unzip it and then inspect the files that make it up. Applications other than MS Word can also create DOCX files. OpenOffice is one of them and so is LibreOffice.
DOCM: These documents are much like docx files except that they contain embedded macro code. The code helps to automate the document. Macros are particularly helpful when executing repetitive tasks in Word, like data entry.
.TXT: These are purely text documents and have no formatting added to them. A typical TXT document would be used for taking down notes. Many programmers use them for writing code or instructions. Just about any program can open and read this format. You can create these files using Notepad on Windows and TextEdit on a Mac.
.HTML: Now, meet the language that is also a file format. HTML stands for Hypertext Markup Language. It is used for web pages both as a language and as a plain text file format. If you open an HTML file in a plain text editor like Notepad, you will see the markup tags, which are instructions that are not displayed on the actual web page. HTML is the behind-the-scenes text that gets websites to work as intended.
.PDF: If you have been wondering why most people request PDF format, it is because this format can be viewed in almost any environment. PDF stands for Portable Document Format, which makes it appropriate for sharing documents. A PDF file will display the document exactly as the creator intended it to appear. It is also print-ready. PDF documents, however, are not designed for easy editing, and changing one usually requires dedicated software. You can use various applications to view a file with the .PDF extension, although Adobe Reader would be the standard choice.
PPT and PPTX: People who create presentations often are familiar with these file extensions. PowerPoint presentation files are usually saved in PPTX format, while PPT is the older format. PPTX supports text, images, and video, and can be opened in various programs, although Microsoft PowerPoint is the default program for PPTX files.
PPTM: PPTM documents can be used to automate certain functions within a presentation. These documents are just like PPTX files, but they contain macros, which are embedded instructions that carry out certain tasks at the press of a single button.
XLS and XLSM: If you are dealing with files that contain a lot of data, especially in table or graph form, then you will encounter files with these extensions. The XLS and XLSM extensions are commonly used by Microsoft Excel. There are, however, many other programs that can create, edit, and save these files.
XLSX: Although originally designed for Microsoft Excel, XLSX files can be opened using most spreadsheet apps. An XLSX file is stored as a Zip archive of XML files that the program reads to open the document. Data in these documents is stored in columns and rows, which is convenient when dealing with figures.
XLSM: These files are spreadsheet documents with embedded macros. XLSM files contain a set of instructions that help to automate spreadsheet documents. In Excel, macros are recorded from the Developer tab. They help users work faster by automating repeated actions.
CSV: These files make it easier to export and import data. CSV stands for comma-separated values: each line of the file is a row of data, and commas separate the values in that row. CSV files are plain text, so a text editor such as Notepad++ can open them, which is handy when the data is big. Many programmers use this format to move data between programs. You can also use it to save phone contacts that you would like to export.
ODT: This is an alternative to the DOCX file. It can be used for text, objects, images, and styles. ODT files are OpenDocument text files that can be created with most word processing programs. The format is commonly used by free document editors. If the recipient does not have a program that can open these types of files, you can convert the document by saving it as DOCX.
BMP: This is the Bitmap format, which stores an image as a map of its pixels. The file stores all the color information for the image. Because all of this image data is kept, the image maintains its resolution even when transferred. The large size of bitmap files, however, makes the format hard to use often.
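The DOCX and XLSX entries above describe those formats as compressed archives of XML files that can be unzipped and inspected. A minimal sketch of that inspection, using Python's standard `zipfile` module, is shown here. The demo archive is built in memory and only imitates the layout of a DOCX file; it is not a document that Word would open, and the helper names `list_parts` and `build_demo_container` are hypothetical.

```python
import io
import zipfile


def list_parts(container: bytes) -> list[str]:
    """Return the names of the files stored inside a ZIP-based document."""
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        return archive.namelist()


def build_demo_container() -> bytes:
    """Build a tiny in-memory ZIP that imitates a DOCX layout (placeholder content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("word/document.xml", "<document/>")
        archive.writestr("word/styles.xml", "<styles/>")
    return buffer.getvalue()


demo = build_demo_container()
for name in list_parts(demo):
    print(name)
```

For a real file, the bytes could come from something like `pathlib.Path("tom.docx").read_bytes()`, where the file name is only an example.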
The list of document formats and their extensions is constantly growing and it would be impossible to exhaust it. Let's move on to the application of these different file formats.
DIB: Device-independent bitmap is a file format that stores graphics in a way that does not depend on a particular display device. Two-dimensional images can be stored in this format. It supports different color depths and makes it easy to transfer images from one device to another.
GIF: Graphics Interchange Format is a portable bitmap file format that supports animation and up to 256 colors (8 bits per pixel). It is used mainly for graphics and logos but is not advisable for photographs.
JPG/JPEG: This format is web friendly because it compresses the image data. It is not the best option for high-resolution printing, since the compression discards some image detail to keep the file small.
PNG: PNG stands for Portable Network Graphics. Like JPEG, PNG files are widely used on the web, but unlike JPEG they support transparent backgrounds. PNG compression is lossless, so images keep their colors and stay sharp.
TIFF: Tagged Image File Format is typically used as a lossless format that allows images to maintain their original quality. For high-resolution photography, TIFF would be the best option to save files in. It is also a good option for preserving the quality of a scanned image.
PS: Files with the ps extension are Adobe PostScript files, which are used by publishers to print text and images on the same page. Within the file, a user can embed printing instructions. PostScript is also a programming language.
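The BMP and GIF entries above mention file size and bits per pixel without numbers. The arithmetic can be sketched as follows. The calculation covers only the raw pixel data and ignores headers and row padding, and the 1920 by 1080 image at 24 bits per pixel is an assumed example, not a figure from this guide.

```python
def uncompressed_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Approximate pixel-data size, ignoring headers and row padding."""
    return width * height * bits_per_pixel // 8


def color_count(bits_per_pixel: int) -> int:
    """Number of distinct colors representable at a given bit depth."""
    return 2 ** bits_per_pixel


# A 1920x1080 image at 24 bits per pixel (an assumed example).
print(uncompressed_size_bytes(1920, 1080, 24))  # 6220800 bytes, about 6.2 MB

# 8 bits per pixel, the limit mentioned for GIF.
print(color_count(8))  # 256
```

Compressed formats such as JPEG or PNG store the same picture in fewer bytes, which is why uncompressed bitmaps are rarely the first choice for sharing.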
When choosing automation formats, priority is given to the program already in use. In most cases, that program would be Microsoft Word or a similar word processing program. The same Word format that you are already using would be appropriate when automating. That means automated documents would be in either DOCX or DOC format, since these are the default formats for most users.
If you need to automate reports, you will probably be comfortable using Excel format since figures are more commonly managed on an Excel sheet. You can then choose any of the Excel extensions to save the document template for automation.
Although PDF may also be used for Word document templates, it is not an easy-to-use format. If you have ever tried to copy and paste a PDF document into Word, you know the struggles involved. This is why docx would be preferred. If, however, you are dealing with a document that has a lot of graphs, you would be better off formatting it as PDF. Another solution would be to use an HTML report template that controls the layout and content of a file. This, however, requires the involvement of someone with knowledge of CSS.
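The idea of an HTML report template that controls layout and content can be sketched with Python's standard `string.Template`. This is a minimal illustration under stated assumptions, not how any particular reporting product works: the CSS rules, the placeholder names `$title` and `$rows`, the function `render_report`, and the sales figures are all made up for the example.

```python
from html import escape
from string import Template

# The CSS in <style> controls layout; $title and $rows are filled in per report.
REPORT_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head>
<style>
  body { font-family: sans-serif; }
  h1 { color: #333333; }
  td { padding: 4px 12px; }
</style>
<title>$title</title>
</head>
<body>
<h1>$title</h1>
<table>
$rows
</table>
</body>
</html>
""")


def render_report(title: str, figures: dict[str, float]) -> str:
    """Fill the template with a title and one table row per figure."""
    rows = "\n".join(
        f"<tr><td>{escape(label)}</td><td>{value}</td></tr>"
        for label, value in figures.items()
    )
    return REPORT_TEMPLATE.substitute(title=escape(title), rows=rows)


html_report = render_report("Quarterly sales", {"North": 1200.0, "South": 950.5})
print(html_report)
```

Keeping the styling in the template and the figures in the data means the same layout can be reused for every report.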
When you are dealing with presentations, you may have a combination of content, from text to video and everything in between. PowerPoint can support various kinds of formatted text and documents.
The format you choose will depend on the type of presentation you are making. For example, if it has a lot of figures, you will probably use data from an Excel document, so it will need an Excel file format that is compatible with PowerPoint, like .XML. If the presentation is made up of many graphs, then PDF would be the right format.
However, the standard PowerPoint format is PPTX, which opens in various versions of PowerPoint on a PC as well as on a smartphone with the PowerPoint app installed. If you have images as part of a slide show, you can save them as JPG so that every slide is saved as an ordinary image.
Now, the following formats are very useful for sharing data but at the same time, they can be dangerous if you do not have a safety procedure to follow when dealing with them.
Executable files come in various types, but you can identify them by the extensions that show their format. Once you click on a downloaded executable file, the PC will try to run it. Usually, this is no problem, since executable file formats are used for auto-installers as well as apps. But there is a big problem with these formats: they are also used by people trying to infect your PC with a virus.
An executable file runs with the same privileges as the user who launches it, which can mean access to the entire system. If you have administrator privileges, then the file will have the same privileges and can do whatever its code instructs. Hopefully, you can see how this is useful when installing software or updating your antivirus, and at the same time how dangerous it can be.
The rule of thumb when dealing with this kind of download is not to click on any executable sent from an untrusted source. Many attackers will give a virus a name that disguises it as useful software or an app you might be interested in. If the source of such a file is not known, do not trust the download.
Executable formats include the following extensions:
JAR: These are Java archive files. They are compressed and include a manifest file that describes how the archive is to be run. JAR files bundle classes, metadata, and resources, which makes them faster to download. A complete application and its classes can be downloaded in a single request.
EAR (Enterprise Java app): Java EE uses EAR file format to package modules into an archive. These modules can then be deployed at the same time on the application server. Within the files, a user can store XML files that help in the deployment of modules.
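The rule of thumb about untrusted executables can be supported by a simple filename check before anything is opened. The sketch below is only a heuristic that looks at names, not contents, so it cannot detect a virus. The extension list combines extensions named in this guide with `.exe` and `.msi`, which are assumed additions, and the function names are hypothetical.

```python
from pathlib import PurePath

# Extensions named in this guide, plus ".exe" and ".msi" as assumed additions.
# This is an illustrative list, not a complete one.
RISKY_EXTENSIONS = {".jar", ".ear", ".bat", ".reg", ".exe", ".msi"}


def looks_risky(filename: str) -> bool:
    """True if the final extension is one that the system may try to run."""
    return PurePath(filename).suffix.lower() in RISKY_EXTENSIONS


def has_disguised_extension(filename: str) -> bool:
    """True for names like 'invoice.pdf.exe' that end in a risky extension
    after another extension. Harmless dotted names can also match."""
    suffixes = [s.lower() for s in PurePath(filename).suffixes]
    return len(suffixes) >= 2 and suffixes[-1] in RISKY_EXTENSIONS


for name in ["report.docx", "setup.exe", "invoice.pdf.exe"]:
    print(name, looks_risky(name), has_disguised_extension(name))
```

A name that passes this check is not proof of safety; the source of the file still matters most.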
Compressed formats are mainly used for transporting files. If a file is very large, it can be compressed so that it is easy to send, and when the recipient gets it, they can unzip it to access the data. Executable files may also fall into this category. Word documents in RTF format can be compressed, since they tend to take up a lot of space.
To open files in a compressed format, you may need special software for some file types, while others can be opened automatically. Many software updates are sent as compressed files, and once downloaded, they can install without the user having to open them.
Once again, you should be very careful when dealing with compressed files. Although they are helpful, they can be used to send viruses. If you have automated your system, there may be software that unzips compressed files without your involvement, and that may wind up infecting the system.
Common compressed file extensions include:
RAR: A user can use WinRAR to create RAR files, which are compressed archive files. To access the contents of a RAR file, a user has to extract them, just as with a Zip file. RAR files can be used on Windows as well as macOS.
Scripts are similar to executable files, but they are written in a scripting language and are not compiled the way executable files are. For files in this format, like PostScript, you can use a text editor to open them and inspect their source code. These files can then be run like executable files. They are also a danger to your system. Be careful when dealing with scripts from sources you haven’t verified.
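One way to be careful with compressed files, as advised above, is to look at what an archive contains before extracting it. The sketch below uses Python's standard `zipfile` module to flag member names that would land outside the extraction folder. It handles Zip archives only, since the standard library does not read RAR files, and it checks names rather than scanning for viruses. The function name `unsafe_members` and the demo file names are hypothetical.

```python
import io
import zipfile
from pathlib import PurePosixPath


def unsafe_members(archive_bytes: bytes) -> list[str]:
    """List member names that would escape the extraction folder."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in archive.namelist():
            path = PurePosixPath(name)
            # Absolute paths and ".." segments point outside the target folder.
            if path.is_absolute() or ".." in path.parts:
                flagged.append(name)
    return flagged


# Build a small demo archive in memory with one suspicious member name.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("notes/readme.txt", "hello")
    demo.writestr("../outside.txt", "this name tries to leave the folder")

print(unsafe_members(buffer.getvalue()))
```

An empty result only means the names look ordinary; files from an unknown source should still be treated with caution.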
Advancement in technology makes it easy to deal with different documents. The process of choosing a file format can be done automatically based on the kind of document you are creating. It is also made easier when you use formats that can be opened and viewed by many programs. For example, DOCX, PDF, and XML are commonly used.
Since it is impossible to master all the formats right away, users have to keep learning through their experience once they understand the basics contained in this guide.
BAT: BAT files make it easier for a user to automate tasks on Windows. By putting a script of commands in the file, you can have them run automatically, one after another.
REG: Use REG files to add and change values within the Windows registry. A user can also export registry entries to a REG file as a backup before making changes to the registry. Because a REG file is plain text, it can be edited by hand and shared so that the same changes can be applied elsewhere.
Windward offers products that automate documents and output them in a wide range of formats including HTML, PDF, DOCX, XLSX, PPTX, and a lot more.