Introduction to the compilation process
The compilation process transforms a string (what's written in a file) into lexer tokens (words), then builds an Abstract Syntax Tree (AST) representation of them in order to extract higher meaning (sentences). Once the AST is created, an intermediate representation (IR) of the code is generated; you can think of this as a generic assembly version of the code. Different computer architectures have different assembly instructions, because assembly instructions have no inherent meaning based on the laws of physics: they are standards defined by whoever created the interface between the hardware architecture, the operating system, and end users. After the intermediate representation is created, the code is compiled and executed. I'm not 100% sure about the nuances of this last part, so I'm not going to talk about it here.
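As a minimal sketch of the first two steps (string to tokens, tokens to AST), assuming a toy language of integers and "+"; all the names here (Token, lex, Num, Add, parse) are illustrative, not from any real compiler:

    # A minimal sketch of the string -> tokens -> AST pipeline for a toy
    # language of integers and '+'.
    from dataclasses import dataclass

    @dataclass
    class Token:
        kind: str   # "NUM" or "PLUS"
        text: str

    def lex(source: str) -> list[Token]:
        """Turn the raw string into a flat list of tokens (the 'words')."""
        tokens, i = [], 0
        while i < len(source):
            ch = source[i]
            if ch.isspace():
                i += 1
            elif ch.isdigit():
                j = i
                while j < len(source) and source[j].isdigit():
                    j += 1
                tokens.append(Token("NUM", source[i:j]))
                i = j
            elif ch == "+":
                tokens.append(Token("PLUS", ch))
                i += 1
            else:
                raise SyntaxError(f"unexpected character {ch!r}")
        return tokens

    @dataclass
    class Num:            # AST leaf: a literal number
        value: int

    @dataclass
    class Add:            # AST node: left + right
        left: "Num | Add"
        right: Num

    def parse(tokens: list[Token]) -> "Num | Add":
        """Group tokens into an AST (the 'sentences'): NUM (PLUS NUM)*."""
        node: "Num | Add" = Num(int(tokens[0].text))
        i = 1
        while i < len(tokens):
            assert tokens[i].kind == "PLUS"
            node = Add(node, Num(int(tokens[i + 1].text)))
            i += 2
        return node

    print(parse(lex("1 + 2 + 3")))   # Add(Add(Num(1), Num(2)), Num(3))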
Grammar (Terminals and Non-Terminals)
The way programming languages are defined is based on work by Noam Chomsky (Syntactic Structures), which, in a very simplified way, says that you can create rules that generate a language by defining Terminals, which are the concrete symbols of the language (for example, a digit or a number), and Non-Terminals, which are higher-level abstract structures that eventually expand into terminals (for example, digits and numbers combined, which eventually become an actual digit or number). Over time people realized that, regardless of which programming language was being created, the process of going from the rules that define terminals and non-terminals to the Abstract Syntax Tree (the data structure created to make meaning from tokens) is always very similar. So people created parser generators: programs that take the rules of a grammar and produce a parser, which builds a parse tree, a data structure that is then converted into the AST. A grammar, then, is a set of rules that determine how non-terminal tokens expand into terminal tokens; this process generates a tree data structure that can be used to build sentences from the terminals based on the way the tree is traversed. Different derivation strategies allow different ways to traverse the tree (always expanding the leftmost or the rightmost non-terminal, for example), and this defines the structure of the grammar. So, for example, you can have the following structure:
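(The original image with the grammar is not reproduced here; what follows is a hypothetical reconstruction of rules consistent with the S, E, T, and D tokens discussed below, not the exact grammar from the image:)

    S -> E | D             (start symbol: an expression or a digit)
    E -> E + T | T         (expression)
    T -> T * D | D         (term)
    D -> 0 | 1 | ... | 9   (digit: the terminal tokens)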
In this case S, E, and T are non-terminal tokens and D is a terminal token. The rules determine what the token S can expand into (E or D here); from that point the token S will be transformed into one of these options, and the choice of which one also depends on the rules of the grammar. This generates the order of traversal for the final parse tree, because it determines how sentences can be constructed from the base of the tree (terminal tokens) up to the top (non-terminals), which are the higher-level sentences of the language.
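To make that concrete under the reconstructed rules above, the single-digit sentence 7 would be derived step by step as

    S => E => T => D => 7

where each arrow replaces a non-terminal with one of the options its rule allows.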
So in the example the token S becomes D through the derivation S => E => T => D (the path highlighted in green in the image), although it could become other terminal tokens through other parts of the graph. Once the parse tree is generated, the order of traversal of the nodes determines which sentences are formed; in this case the traversal goes from D (the terminal token at the base of the tree) to T, then E, and finally S. If different traversals were made, different sentences would be created. The syntax rules determine the shape of the tree data structure and the grammar rules determine the order of traversal; combined, they create the grammar. So the fundamental difference between programming languages is how the grammar and syntax rules shape the tree data structure that creates meaning.

This process of encapsulating data/information into a symbolic representation (letters or numbers) also determines how memory management will work, because they are essentially the same thing. When you encapsulate information, for example the zeros and ones that define impulses on the hardware, into a symbol that represents what those zeros and ones mean, for example the letter A, you are as a by-product defining how memory is shaped, because memory is the thing that stores that symbolic representation in hardware as binary impulses. So you can think of memory as the gravitational force that a network creates in order to retain its symbolic representation of what data/information means. And that is the fundamental relationship of spacetime that determines computation: space is impulse, time is symbols, and the gradient between the two creates geometry/information.
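To see that traversal in code, here is a minimal sketch of the parse tree for the derivation S => E => T => D and a post-order walk that visits the terminal first and the start symbol last, mirroring the D, T, E, S order described above; the Node class and post_order function are illustrative names, not from any real parser library:

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        symbol: str
        children: list["Node"] = field(default_factory=list)

    def post_order(node: Node) -> list[str]:
        """Visit children (toward the terminals) before the node itself."""
        visited: list[str] = []
        for child in node.children:
            visited.extend(post_order(child))
        visited.append(node.symbol)
        return visited

    # Parse tree for the single-digit sentence "7": S -> E -> T -> D -> 7
    tree = Node("S", [Node("E", [Node("T", [Node("D", [Node("7")])])])])

    print(post_order(tree))   # ['7', 'D', 'T', 'E', 'S']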
Probably the best videos for understanding how data (zeros and ones) becomes symbols are Ben Eater's series, where he shows how combinational logic is used to create symbols with an EEPROM and a hex decoder.
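As a rough software analogue of that idea, a lookup table (the "ROM") maps a raw 4-bit impulse pattern to the symbol it is meant to represent; the HEX_ROM table and decode function below are hypothetical, not a dump of a real EEPROM:

    HEX_ROM = {
        0b0000: "0", 0b0001: "1", 0b0010: "2", 0b0011: "3",
        0b0100: "4", 0b0101: "5", 0b0110: "6", 0b0111: "7",
        0b1000: "8", 0b1001: "9", 0b1010: "A", 0b1011: "B",
        0b1100: "C", 0b1101: "D", 0b1110: "E", 0b1111: "F",
    }

    def decode(bits: int) -> str:
        """Turn a 4-bit input (impulses) into its hex symbol (meaning)."""
        return HEX_ROM[bits & 0b1111]

    print(decode(0b1010))   # A -- the same voltages, now carrying a symbol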
Conclusion
Programming languages are determined by their grammar rules, which create the relationship between terminal and non-terminal symbols. The process of compilation is the conversion of an input (usually a string from a file) into tokens, then an abstract syntax tree, then an intermediate representation, and finally byte code. The data structure that defines a grammar highlights the relationship between data/information (impulses) and symbols: the grammar's structure defines the geometric shape of the symbolic representation of a network that creates computation. Embedded in this relationship is the way the language is going to store and use memory, because memory is just the gravitational force of how impulses are stored so that higher meaning can be retrieved later on, based on the data structure that defines the rules of the grammar for that language/communication, or coherence on a network of nodes that creates computation on spacetime (big O notation).