FlexRadio 6600 with Maestro And Early MTU Problems.

Update: FlexRadio says this issue was fixed in SmartSDR release 3.3.29 in April 2022, but a test showed the problem is still there. The same is true of 3.3.32.

Remember this number: 1478. Another way to put this is “The maximum segment size should be set appropriately for the network path between the two endpoints, so that no segment has to be fragmented” (to be explained later).

Mark this day, February 29, 2020, which is leap day.  A new FlexRadio 6600 transceiver was ordered today and that’s a big leap.  The advertising seems to emphasize both FT8 and remote operation.  What could be better?

[Image: FlexRadio 6600M]

Actually this is a picture of the 6600M, meaning it has the Maestro front panel built in.  A 6600 and a Maestro were ordered separately so it can be used remotely.  It’s a big step breaking away from the Icom juggernaut.  The plan is to see which radio performs better and sell the other.  Beating an IC-7610 will be a big challenge.

The 6600 arrives! The Maestro is on back order. The 6600 is even more beautiful in person. A perfect project for the COVID lockdown using a PC and SmartSDR while waiting for the backordered Maestro. Now to see it perform.  First tests are to feed a signal to WSJT-X and count FT8 decodes.  The antenna here at home is hookup wire with a tuner, which is not equivalent to the 43′ vertical at the remote base but is good for testing.  Evaluating a few passes shows fewer decodes and poorer SNR, just as expected.  It’s not a real test until both radios are on the same antenna but it does demonstrate how easy it is to get up and running on FT8.  The challenge begins.

[Image: Testing at home before the journey to the remote site.]

[Image: Installed at the remote site and operating, squeezed between a battery and the 7610.]

The installation went 99 percent smoothly with only one issue remaining.  That one remaining issue turned into a five-week nightmare to troubleshoot.  Issue: the 6600 cannot be reached over the Internet when the Internet provider at the client end is Comcast Xfinity.  Access works when the client is using an AT&T mobile hotspot. It also works when the client is using an iPhone hotspot on AT&T.  Flex support says Xfinity has implemented a new security feature called Advanced Security and that might be the issue.  An attempt is being made now to work on this issue, so far with little luck and much frustration.

All indications point to Xfinity being the problem, so a new modem was ordered which is xFi capable.  The existing modem is too old to have that new Xfinity security feature.  The modem is scheduled to arrive this week.

Followups:  The Arris TG1682G modem purchased from eBay arrives.  The new eBay modem worked great for Internet access but it did not provide the feature needed, which is xFi.  Working with Comcast technicians over the phone, it was finally disclosed that the xFi feature is only available on Comcast-supplied modems.  A Comcast modem was ordered, arrived, and was activated.  xFi works now and Advanced Security was turned on and off.  No joy.  Same results. The 6600 cannot be reached.  A Flex community forum post said someone had the identical problem.  He said the old modem, an Arris 862G, worked but his new one didn’t.  An Arris 862G was immediately ordered from eBay and its arrival is being anxiously awaited, as of April 24.  Meanwhile, Flex still has an open ticket but it is on hold until all the gyrations are exhausted on this end.  The AT&T hotspot still works and is being used as a temporary alternative.

Followup May 10, 2020.  The Arris 862G produced identical results.   No joy.

A VPN (virtual private network) was set up as a test on Comcast Xfinity and it works (ExpressVPN).  Go figure. That is a second workaround along with the AT&T mobile hotspot, but not a solution.

Dan Quigley, the manager of support at FlexRadio, became involved. He looked at traceroutes and theorized about where in the network the problem might be, comparing good traceroutes to bad traceroutes.  Zayo is the provider at the destination and all connections work once they get to Zayo (except clients with Comcast).  They also work if they go through the last couple of Comcast routers, the highest of which has the host name or DNS name of “910fifteenth”.  That is the address of the “Internet Hotel” in downtown Denver, which is a peering location for handing off Internet service to multiple providers in the region.  The router directly above that is “1601milehigh”.   That is the address of the small Comcast building in the parking lot of Mile High Stadium. Dan thinks that might be where some packet filtering is going on.  Contacting Comcast Advanced Repair was fruitless and frustrating.  Very much no joy and no progress.

Meanwhile a network probe was installed at the Strasburg end to see if any packets are by chance actually reaching the Flex.  They ARE. So the packets are apparently not getting blocked or filtered by Comcast. The Wireshark sniffer shows what is happening. When a setup packet comes into Strasburg the Flex responds. An oversized TCP packet is being generated by the Flex, but only with Comcast at the client end.  The size of the packet is 1514 bytes and the MTU (maximum transmission unit) in the remote router is 1492.  An error shows up in Wireshark coming from the remote router which says the packet should be fragmented.

Why is the Flex not fragmenting packets, and why only when we are using Comcast?   The VPN also uses Comcast and it causes no problem.   Why?    This information has been forwarded to Dan Quigley at Flex and his ideas are being anxiously awaited.  No response from Flex in three days. (Or ever)
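In standard IPv4 terms, an error from a router saying a packet should have been fragmented is an ICMP Destination Unreachable message with code 4 ("fragmentation needed and DF set"), and RFC 1191 has the router put its own MTU in that message. The Python sketch below shows how that number can be pulled out of the first 8 bytes of such a message. The function name and the sample bytes are made up for illustration; they are not taken from the actual capture.

```python
import struct

ICMP_DEST_UNREACHABLE = 3
ICMP_FRAG_NEEDED = 4  # "fragmentation needed and DF set"


def next_hop_mtu(icmp_message: bytes):
    """Return the next-hop MTU from an ICMP 'fragmentation needed' message.

    Returns None if the message is some other ICMP type or code.
    Header layout (RFC 792 / RFC 1191):
    type (1 byte), code (1), checksum (2), unused (2), next-hop MTU (2).
    """
    if len(icmp_message) < 8:
        raise ValueError("an ICMP header is at least 8 bytes")
    icmp_type, code, _checksum, _unused, mtu = struct.unpack(
        "!BBHHH", icmp_message[:8]
    )
    if icmp_type != ICMP_DEST_UNREACHABLE or code != ICMP_FRAG_NEEDED:
        return None
    return mtu


# Hypothetical sample: type 3, code 4, zero checksum, next-hop MTU 1492.
sample = struct.pack("!BBHHH", 3, 4, 0, 0, 1492)
print(next_hop_mtu(sample))
```

A sender that receives and honors this message can shrink its packets to fit. A sender that never sees it keeps retrying at the same size.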

Followup:  Success!  We figured the problem out on our own after “putting in the work” as they say. It is solved by a simple parameter change on our own equipment.  A connection from a client using Comcast with no VPN works perfectly now.  Audio is not choppy.  Full panafall display works just as it should.  CW is smooth.  All is good.  The simple and free change?  Reducing MTU on the client router from 1500 to 1478.  That’s it.  Ready for the long and tedious explanation? Here goes.

Changing the MTU causes the PC to send a SYN packet with an MSS (maximum segment size) specification of 1438.  By the rules of the Internet (RFC 879 and RFC 6691, which cover the TCP MSS option) the Flex is therefore obligated to send no packet with a payload bigger than 1438 bytes.  The Flex adds on the header overhead of 54 bytes, which can’t be reduced, for a total of 1492.  Bingo.  The MTU of the remote router is 1492.  That number is prescribed because the Internet service uses the PPPoE protocol, which is limited to 1492.  Here is the key to the original failure, which requires a more detailed explanation. Overwhelmed yet?
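The arithmetic can be checked with a few lines of Python. This is only a sketch, and it assumes the 54 bytes of overhead is the usual 14-byte Ethernet header plus a 20-byte IPv4 header and a 20-byte TCP header with no options. The function names are made up for illustration.

```python
IP_HEADER = 20        # IPv4 header without options
TCP_HEADER = 20       # TCP header without options
ETHERNET_HEADER = 14  # destination MAC + source MAC + EtherType


def mss_for_mtu(mtu: int) -> int:
    """MSS a host would advertise for a given MTU (MTU minus IP and TCP headers)."""
    return mtu - IP_HEADER - TCP_HEADER


def frame_size(mss: int) -> int:
    """Size of a full segment as a sniffer shows it, Ethernet header included."""
    return mss + ETHERNET_HEADER + IP_HEADER + TCP_HEADER


for mtu in (1500, 1478):
    mss = mss_for_mtu(mtu)
    print(f"MTU {mtu}: MSS {mss}, full-size frame {frame_size(mss)} bytes")
```

With the default MTU of 1500 this gives an MSS of 1460 and a 1514-byte frame, the size seen in the capture. With 1478 it gives an MSS of 1438 and a 1492-byte frame.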

The Flex was generating packets that were 1514 bytes in length and the remote router was correctly dropping them and sending back an error message.  Unfortunately Flex has ICMP turned off, which means it was never getting those error messages.  The Flex would retry the oversized packet several times and give up.  A connection request always timed out and failed.

Why did other Internet carriers work?   The answer lies in the MTU of those networks.  Just by luck the MTUs somewhere deep in their networks were always 1478 or smaller.  The client PC always performs a Path MTU Discovery at the beginning of a connection to discover the smallest MTU deep in the network, and Windows uses this information to avoid causing fragmented packets. This detail was researched from a Cisco white paper. The PC inserts that number in a SYN packet telling the far end host how big the host’s MSS, or payload, can be.  At that point the host, the Flex, would generate packets no bigger than 1492 and therefore would not overflow the remote router.   Connection successful.
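One way to look for the smallest MTU on a path by hand is to send pings with the Don't Fragment bit set and vary the payload size. On Windows that is `ping -f -l SIZE host`. The sketch below does a binary search over payload sizes. It is not what Windows does internally, just a manual probe, and it assumes a successful reply contains "TTL=" in the ping output. The target must be a host that answers ping, and any host name used with it is hypothetical.

```python
import subprocess

ICMP_AND_IP_HEADERS = 28  # 8-byte ICMP header + 20-byte IPv4 header


def windows_ping_fits(host: str, payload: int) -> bool:
    """Send one ping with Don't Fragment set, using Windows ping syntax.

    Assumption: a successful echo reply contains "TTL=" in the output.
    """
    result = subprocess.run(
        ["ping", "-n", "1", "-f", "-l", str(payload), host],
        capture_output=True,
        text=True,
    )
    return "TTL=" in result.stdout


def largest_payload(fits, low: int = 0, high: int = 1472) -> int:
    """Binary-search the largest payload for which fits(payload) is True.

    Returns -1 if no payload in the range gets through.
    """
    best = -1
    while low <= high:
        mid = (low + high) // 2
        if fits(mid):
            best = mid
            low = mid + 1
        else:
            high = mid - 1
    return best


def path_mtu(fits) -> int:
    """Convert the largest working ping payload into an MTU figure."""
    payload = largest_payload(fits)
    return payload + ICMP_AND_IP_HEADERS if payload >= 0 else -1


# Offline demonstration with a simulated path whose smallest MTU is 1492.
simulated = lambda payload: payload + ICMP_AND_IP_HEADERS <= 1492
print(path_mtu(simulated))
```

Against a real path the call would look like `path_mtu(lambda n: windows_ping_fits("remote-router.example.net", n))`, where the host name is a placeholder.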

How was this solution figured out?   It has apparently not been written about in the Flex community forum, since multiple searches produced nothing.  Comcast was no help in network troubleshooting.  Flex’s only response has been to point us in the wrong direction, the network.  Certainly Flex has added to the problem by having ICMP turned off, which caused it to not even receive the error messages.  If it had received the errors it possibly could have adjusted the packet size on the fly and fixed the problem without intervention.  But that’s just speculation.  No doubt Flex has a good reason to have ICMP turned off.  Here is one good explanation which also includes an easy way to turn it back on if one had shell access.    https://www.thegeekstuff.com/2010/07/how-to-disable-ping-replies-in-linux/

Again you asked, how was this solution figured out?  It came to mind after reading a white paper from Cisco about how Path MTU Discovery works.  Learning about MSS was also key in understanding the problem.    https://www.cisco.com/c/en/us/support/docs/ip/generic-routing-encapsulation-gre/25885-pmtud-ipfrag.html

The big puzzle was why the Flex was sending packets that were too large.  How did the Flex determine the size of those packets? A little digging provided the answer:  MSS, or maximum segment size.  MSS is the payload size.  By the rules the host can vary the size of a packet by changing the size of the payload as long as it’s not bigger than the MSS.   The host cannot exceed the MSS. A header with a fixed size is wrapped around that payload.

The next logical question is how the MSS is set.   A little more digging provided the answer:    a client PC runs Path MTU Discovery and sends a SYN packet to the remote host which includes the MSS calculated from that Discovery.     Consider this.  The Comcast network could be so good that all their routers have the maximum MTU.   Therefore the PC could be sending a SYN packet with an MSS that is ok for Comcast but too big for the network at the remote site (remember it uses the PPPoE protocol, which has an MTU of 1492).  At that point it was easy to use Wireshark to show what the actual MSS was in the SYN packets.  Indeed it was too large.

The next question is how the MSS could be reduced.  The answer lies in the fact that Path MTU Discovery looks at all the MTUs in all the routers along the path, including the router at each end.  It then uses the smallest MTU found to calculate the MSS.  Aha. Maybe if the router at the client end had the smallest MTU, the Path Discovery would calculate a small MSS.  Changing the MTU on the router at the client end was tried.  The MTU was manually set to a number that caused the MSS to be small enough not to cause an error when the Flex generated its reply.  That MTU number is 1478.  MSS is calculated by the machine to be 1438.  The fixed header is 54 bytes, which is wrapped around the payload of 1438 bytes for a total of 1492.   Joy.  That 1478 is the MTU of the client router and it is happy. Problem solved.
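In Wireshark the MSS shows up under the TCP options of a SYN packet (a display filter such as `tcp.flags.syn == 1` narrows the list, and the value is exposed as the field `tcp.options.mss_val`). For anyone curious what is being decoded there, here is a minimal Python sketch that walks the raw TCP option bytes and returns the MSS. The function name and the sample option bytes are made up for illustration.

```python
def mss_from_tcp_options(options: bytes):
    """Walk raw TCP options and return the MSS value, or None if absent.

    Option kinds: 0 = end of list, 1 = no-op, 2 = MSS (total length 4).
    Every other option is kind, length, then (length - 2) bytes of data.
    """
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:              # end of option list
            break
        if kind == 1:              # single-byte padding
            i += 1
            continue
        if i + 1 >= len(options):  # truncated option
            break
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                  # malformed option
        if kind == 2 and length == 4:
            return int.from_bytes(options[i + 2:i + 4], "big")
        i += length
    return None


# Hypothetical SYN options: MSS 1460, NOP, window scale 8, NOP, NOP, SACK permitted.
syn_options = bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 8, 1, 1, 4, 2])
print(mss_from_tcp_options(syn_options))
```

An MSS of 1460 is what a client on a plain 1500-byte MTU would be expected to advertise. After the router change described above, the value to look for is 1438.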

Sidebar:  Why did some networks work and not others?  The Internet service at the remote site is provided by a wireless ISP which uses the PPPoE protocol.  PPPoE and ADSL both add 8 bytes of overhead to the packet headers, which reduces the MTU by 8 bytes.  The customer only gets to use 1492 instead of the 1500 that is standard for most of the Internet.  That is most likely why the connection to the Flex also failed when we tested it on CenturyLink.   CenturyLink uses ADSL, which also limits the MTU to 1492.  As for why it works on the VPN, the answer is probably in the MTU of one of the routers used along the way by the VPN.  The same possibility applies to the AT&T mobile hotspot.  Some hop has a small MTU, which results in a smaller MSS, which results in the Flex sending smaller packets, which results in a good connection.

To help others avoid suffering through this, the following post was made to the Flex Community Forum:

Can’t connect to Flex using Smartlink over Comcast but works ok with other providers?

The post included a very short synopsis and solution in the hope of helping someone out there.

January 2022: A new router and upgraded Internet service make the MTU problem rear its ugly head again. This time the problem is caused by not being able to change the MTU on the router. The first new router, an ASUS AZ86U, only allows the MTU to be changed if the Internet connection type is PPPoE. That router was exchanged for a Netgear Nighthawk RAX120. Changing the MTU is supported but it doesn’t work. Changing the MTU parameter and applying it makes no difference to what the router actually transmits. Netgear case number 45573570. A TP-Link AX6600 works fine.

Workarounds.

At least two workarounds are possible. The first one is to add a second router in series with the PC or Maestro and set the MTU on the new router.

The second workaround uses the command line to do network shell routines. One routine can change the MTU on the PC. Open cmd with administrator permissions and enter these commands.

>netsh

>interface

>ipv4

At the ipv4 prompt enter the following command.

>set subinterface "Ethernet" mtu=1438 store=persistent

An “ok” will be returned if this works. To test, quit the ipv4 shell and enter the following command:

>netsh int ip show int

The results returned will show the current MTU.

Note that the magic number may be lower or higher. Experiment if the one above doesn’t work.
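The interactive steps above can also be given to netsh as a single command line, which makes it easier to try several values. The Python sketch below only builds and prints the commands by default, and runs them when `dry_run` is set to False in an administrator session on Windows. The function names and the sanity range on the MTU are assumptions made for this example, not netsh rules, and "Ethernet" stands for whatever the interface is called on the PC.

```python
import subprocess


def set_mtu_command(interface: str, mtu: int) -> list:
    """Build the one-line form of the netsh command shown above."""
    if not 576 <= mtu <= 1500:  # conservative sanity range, an assumption here
        raise ValueError(f"MTU {mtu} is outside the range this sketch accepts")
    return ["netsh", "interface", "ipv4", "set", "subinterface",
            interface, f"mtu={mtu}", "store=persistent"]


def show_mtu_command() -> list:
    """Build a netsh command that lists each interface with its MTU."""
    return ["netsh", "interface", "ipv4", "show", "subinterfaces"]


def apply(command: list, dry_run: bool = True) -> None:
    """Print the command, and run it only when dry_run is False."""
    print(subprocess.list2cmdline(command))
    if not dry_run:
        subprocess.run(command, check=True)


apply(set_mtu_command("Ethernet", 1438))
apply(show_mtu_command())
```

Passing the interface name as its own list element means a name with spaces, such as "Ethernet 2", needs no extra quoting when the command is actually run.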

A third workaround could be to find a router with an MSS Clamping feature. That has not been tested here.
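For anyone wondering what MSS clamping means: a router with that feature rewrites the MSS option inside SYN packets passing through it, lowering the value if it is above a configured limit. The Python sketch below shows only that rewrite on the raw TCP option bytes. It is an illustration of the idea and not any particular router's implementation. A real device would also have to update the TCP checksum, which is left out here, and the function name and sample bytes are made up.

```python
def clamp_mss(options: bytes, limit: int) -> bytes:
    """Return TCP options with the MSS lowered to `limit` if it was larger.

    Option kinds: 0 = end of list, 1 = no-op, 2 = MSS (total length 4).
    Options without an MSS entry are returned unchanged.
    """
    out = bytearray(options)
    i = 0
    while i < len(out):
        kind = out[i]
        if kind == 0:              # end of option list
            break
        if kind == 1:              # single-byte padding
            i += 1
            continue
        if i + 1 >= len(out):      # truncated option
            break
        length = out[i + 1]
        if length < 2 or i + length > len(out):
            break                  # malformed option
        if kind == 2 and length == 4:
            current = int.from_bytes(out[i + 2:i + 4], "big")
            if current > limit:
                out[i + 2:i + 4] = limit.to_bytes(2, "big")
            break
        i += length
    return bytes(out)


# Hypothetical SYN options: MSS 1460, NOP, NOP, SACK permitted.
original = bytes([2, 4, 0x05, 0xB4, 1, 1, 4, 2])
clamped = clamp_mss(original, 1438)
print(int.from_bytes(clamped[2:4], "big"))
```

With a limit of 1438, the value that worked in this article, the far end would be told the same MSS as after the router MTU change, without touching the MTU on the client side.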
