I read that I can put 2 IPs in a single A Record
Not quite: you can create multiple A (or TXT, MX) records with the same name and different values.
I expected that if VM IP1 was powered off a client could resolve to IP2, but that didn't happen. When I do a ping for "sql.domain.local" it always resolves to IP1, despite the fact that this VM is off
When multiple addresses are presented for a given name, the client should try them in turn. This is described in RFC 1794. Ping is a low-level diagnostic tool; I would need to do some significant research to determine whether the behaviour here is deliberate, anachronistic or merely defective.
Browsers work very differently: round-robin DNS (rrDNS) is a very effective tool for supporting high availability for HTTP[S] services. But that's because they implement failure detection with much shorter timeouts (<1 second) than other TCP clients. The default TCP configurations on most operating systems have a failure detection timeout of 5 minutes or more. This also presupposes that the TCP client is properly RFC compliant. It's my experience that Java (or perhaps the application code running on top of Java) does not handle DNS resolution as expected.
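To illustrate what "try them in turn" with a short timeout means in practice, here is a minimal Python sketch. It is not what ping or any particular browser does internally; the function names (`resolve_all`, `connect_first_reachable`) and the 1-second default are my own assumptions, chosen to mirror the sub-second failure detection described above.

```python
import socket


def resolve_all(host, port):
    """Return every distinct address the resolver offers for host, in resolver order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        if sockaddr not in addresses:
            addresses.append(sockaddr)
    return addresses


def connect_first_reachable(addresses, timeout=1.0, connect=socket.create_connection):
    """Try each address in turn and return (address, connection) for the first success.

    `connect` is injectable (an assumption made for testability); by default it is
    socket.create_connection, which accepts a (host, port) pair and a timeout.
    """
    errors = []
    for addr in addresses:
        try:
            # addr[:2] is (host, port) for both IPv4 and IPv6 sockaddr tuples.
            return addr, connect(addr[:2], timeout=timeout)
        except OSError as exc:
            # Covers refused connections and timeouts; move on to the next address.
            errors.append((addr, exc))
    raise OSError(f"no address reachable: {errors}")
```

With two A records behind `sql.domain.local`, something like `connect_first_reachable(resolve_all("sql.domain.local", 1433))` would skip the powered-off VM after roughly one second instead of waiting for the operating system's default TCP timeout (the port number here is only an example).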
An expensive alternative for providing HA access for external clients is via TCP multi-pathing. IME with 2 different providers, failover detection/switching took at least 3 minutes and sometimes didn't happen at all.
While it is a great solution for providing high availability to external clients, I would not use rrDNS as a means of providing high availability for connections between nodes inside a given infrastructure.
but I don't want to put a public IP on my SQLs to use External Load Balance
Not exposing your DBMS servers on a public address is sensible. It doesn't mean you can't connect them via other means. Indeed if you have transactional data on your DBMS then you really, REALLY need to be able to ensure communications between the database nodes. If that's available via VNet peering and your application does not support an intrinsic HA client capability, have a look at HAProxy or ProxySQL.
OTOH you may find that your application is somewhat sensitive to latency between the application server and DBMS (e.g. if using a trivial ORM). In which case allowing an application server in location 'A' to connect to a DBMS in location 'B' will NOT be desirable - here rrDNS to isolated stacks can partially solve the problem but you need to also think about session management and data replication during failover/failback.